(57) Abstract: The invention provides a method for preparing a double stranded nucleic acid which encodes an immunoglobulin, 
comprising the steps of: (a) providing a set of three or more overlapping oligonucleotides which anneal to form the + and - strands of 
a nucleic acid which encodes at least part of an immunoglobulin variable domain; (b) annealing the oligonucleotides; (c) replicating 
the + and - strands of the nucleic acid formed from the annealed oligonucleotides; and (d) inserting the nucleic acid into an expression 
vector. 
Method for Generating Immunoglobulin Genes 



Field of the Invention 

The present invention relates to the de novo synthesis of immunoglobulin genes and the 
generation of libraries of diversified immunoglobulin genes. In particular, the invention 
describes the assembly of antibody genes from oligonucleotides and the footprint 
mutagenesis of framework and/or CDR regions to generate diversified antibody libraries. 

Introduction 

Nucleic acids for the production of immunoglobulins by recombinant DNA 
technology have, in the prior art, been isolated by cloning from natural sources, such as 
from mouse or human tissues. See, for example, European Patent Application 0 368 684 
(MRC/Winter), which describes the cloning of human antibody genes and the 
construction of libraries of human antibodies. Whilst these methods are suitable for the 
generation of large libraries of antibody genes based on natural antibodies, such as human 
antibodies, they do not address the creation of libraries of immunoglobulins not normally 
seen in nature. Moreover, the techniques for the introduction of diversity into cloned 
libraries rely on random or semi-random introduction of mutations, which is an inefficient 
process and results in the production of very large libraries (up to 10^14 members in size) 
containing a large number of inoperative mutants which are incapable of binding to any 
target. 

In nature, antibodies are secreted by antibody-producing cells and operate 
extracellularly. However, in recent years the use of antibodies intracellularly, in the 
cytoplasm and/or the nucleus of cells, has presented a number of advantages in biological 
and therapeutic applications. Intracellular antibodies are not produced naturally, and 
hence there is no natural source of intracellular antibody genes which may be used as a 
basis for cloning and library preparation in the conventional manner. 

Intracellular antibodies or intrabodies are antibody fragments that are used inside cells for 
interaction with target antigens and for interference with cellular function or in some 
cases to mediate cell killing following antigen binding. Intrabodies have particular 
promise in the area of functional genomics where the genome sequence projects are 
generating a plethora of open reading frames (ORFs) for which no functional data are 
available. Intrabodies have a role in defining these functions, especially where protein 
interactions have been defined. In therapeutics, the use of intracellular antibodies for 
functional ablation has been described and should be an invaluable format for 
disease-specific reagents. 

Intracellular antibodies are typically formulated as single chain Fv (scFv) fragments 
which comprise immunoglobulin variable (V) regions of heavy (H) and light (L) chains 
held together by a short linker REF. In the prior art, antigen-specific hybridomas have 
been used as a source of antibody genes from which scFv have been made for in cell 
expression as intrabodies, and many successes have been reported in which cellular 
phenotypes have been obtained due to scFv-antigen binding REF. Conversion of 
hybridoma antibodies into intracellular antibody fragments is laborious as this strategy 
requires an antigen-specific hybridoma from which the scFv derivative must be active in 
the intracellular milieu (which is a reducing environment). 

Several different methods have been used to directly develop intrabodies without the need 
to go through hybridomas. These include genetic screening for intrabody-antigen 
interaction REF and use of fixed scFv frameworks for intrabody libraries REF. In the 
former approach, the intracellular antibody capture (IAC) technology REF facilitated the 
identification of consensus frameworks comprising residues from VH and VL which are 
most commonly found in selected intracellular antibodies. When intracellular antibodies 
based on these scaffolds were expressed in mammalian cells, they were found to be soluble, 
well expressed and functionally efficient REF. 

Since intracellular antibodies appear to require a very specific scaffold structure, the 
cloning of antibody genes as a basis for constructing libraries of intrabodies is not 
feasible. Thus, alternative methods are required for the construction of intracellular 
antibody libraries. 

Summary of the Invention 

In our copending UK patent applications entitled "Intracellular Antibodies" and 
"Anti-activated RAS antibodies", filed on even date herewith, we have demonstrated that the 
IAC consensus frameworks can be used to convert poor intracellular antibodies into 
efficient ones by mutating framework residues to the IAC consensus whilst leaving the 
complementarity determining regions (CDR) intact. In addition, intrabody libraries have 
been made from consensus frameworks and these can be directly screened for 
intracellular binders. 

We show herein that it is possible to build an intrabody library with only the knowledge 
of the intracellular antibody consensus sequence, without resorting to any pre-existing 
antibody gene clones. Two methods are described to achieve this goal. The first is de 
novo antibody gene synthesis in which consensus scFv sequences are used to generate 
oligonucleotides for gene synthesis and the second is the use of these cloned intracellular 
antibody genes for CDR or framework diversification using an approach called footprint 
mutagenesis. 

The methods of the invention are widely applicable to the construction of 
immunoglobulin genes and the diversification of immunoglobulin libraries, for example 
for the isolation of antibodies having a desired specificity and/or affinity maturation of 
existing antibody clones. The invention is useful whether the immunoglobulins are 
artificial and cannot be cloned, such as intrabodies, or are natural and can be cloned. The 
method of the invention is faster and less laborious than cloning-based methods, and 
provides leaner, more focussed libraries with fewer inoperative or redundant members. 

According to a first aspect, therefore, there is provided a method for preparing a double 
stranded nucleic acid which encodes an immunoglobulin, comprising the steps of: 

(a) providing a set of three or more overlapping oligonucleotides which anneal to 
form the + and - strands of a nucleic acid which encodes at least part of an 
immunoglobulin variable domain; 

(b) annealing the oligonucleotides; 
(c) replicating the + and - strands of the nucleic acid formed from the annealed 
oligonucleotides; and 

(d) inserting the nucleic acid into an expression vector. 
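
By way of illustration only, step (a) can be sketched in silico. The following is a minimal sketch, assuming that the oligonucleotides tile each strand end to end and that the junctions on the - strand are staggered by half an oligonucleotide relative to those on the + strand, so that every - strand oligonucleotide bridges two + strand oligonucleotides. The function names, the 40-nucleotide length and the half-length stagger are hypothetical choices made for this example and are not taken from the specification.

```python
# Illustrative sketch only: names, oligo length and stagger are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.translate(COMPLEMENT)[::-1]


def tile(seq: str, length: int, offset: int = 0) -> list[str]:
    """Cut seq into consecutive pieces of `length` nucleotides.

    If `offset` is non-zero, the first cut is made at `offset` instead of
    `length`, which staggers all junctions by that amount.
    """
    cuts = [0] + list(range(offset or length, len(seq), length)) + [len(seq)]
    return [seq[a:b] for a, b in zip(cuts, cuts[1:])]


def design_oligos(plus_strand: str, length: int = 40) -> tuple[list[str], list[str]]:
    """Return (+ strand oligos, - strand oligos), each written 5'->3'.

    The - strand oligos are listed in + strand order; their junctions are
    staggered by length // 2 so that each one overlaps two + strand oligos.
    """
    plus = tile(plus_strand, length)
    minus = [reverse_complement(piece)
             for piece in tile(plus_strand, length, offset=length // 2)]
    return plus, minus
```

Joining the + strand oligonucleotides in order, or the reverse complements of the - strand oligonucleotides in order, regenerates the input sequence, which is a convenient check on a design before synthesis.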

Preferably, the oligonucleotide set encodes an entire VH or VL domain, or both. In an 
advantageous embodiment, the set encodes a scFv molecule comprising both VH and VL 
domains, and a linker coupling said domains. In a further embodiment, the 
oligonucleotides encode a single domain antibody or DAb. 

Oligonucleotides may also be provided which encode one or more constant region 
domains, or other effector groups or labels, which are attached to the immunoglobulin 
molecule. For example, the oligonucleotides may encode all or part of an antibody Fc 
region, a label such as GFP, or an effector group such as a toxin. The method may 
moreover be used to construct multiple domain immunoglobulins, such as bispecific 
antibodies. 

Thus, where the nucleic acid molecule encodes "at least part" of an immunoglobulin 
variable domain, it will be understood that the nucleic acid molecule preferably encodes 
an entire variable domain, and may encode additional domains or further polypeptides. 
Advantageously, the nucleic acid molecule encodes the entirety of the gene product 
which it is desired to produce; thus, it may encode an entire scFv, an entire DAb, an 
entire VH or VL domain, or any of these which has been modified by addition of one or 
more constant region domains, effector groups or labels. 

In a preferred aspect, the invention further comprises the steps of: 

(e) amplifying the nucleic acid molecule encoding the immunoglobulin variable 
domain using a first set of at least four primers, of which two primers are overlapping, 
and wherein one of the overlapping primers consists of a plurality of different primers, 
each of which comprises a mutagenic region which generates a mutation in a part of the 
nucleic acid sequence; 

(f) purifying the amplification products thus obtained; 
(g) assembling the amplification products in a further amplification reaction using 
a second set of primers which encompass the entire nucleic acid molecule which encodes 
the immunoglobulin variable domain; and 

(h) inserting the further amplification product into an expression vector. 

Advantageously, the nucleic acid molecule and further amplification product which is 
finally obtained after the mutagenesis procedure encode the gene product which it is 
desired to obtain in its entirety, as specified above. 

Preferably, steps (e) to (g) are repeated, using different primers, to mutate different 
regions of the nucleic acid molecule and thus generate a library of diversified nucleic acid 
molecules which encompass mutations in a plurality of defined areas. 

In an advantageous embodiment, primers are used which comprise a multiple codon 
equivalence of randomised or semi-randomised sequence. For instance, these can be 
designed to cover a CDR, thus providing the ability to create a diversified CDR and thus 
a library of antibody molecules having a diversified CDR. Moreover, multiple primer 
sets may be used simultaneously to diversify two or more separate regions, such as for 
example two or more CDRs, in order to provide a further diversified library. 

Where two or more primers are used, for example to diversify two or more regions in the 
immunoglobulin chain, the PCR products generated advantageously overlap. The overlap 
is preferably in a region complementary to both primers. Advantageously the overlap 
does not include a diversified region. 

Multiple codon equivalence may extend over two to twelve or more codons; preferably, it 
extends to between three and ten codons. 
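
The practical consequence of the codon equivalence can be seen from simple arithmetic. The sketch below, which is illustrative only, assumes that each randomised codon can encode any of the 20 standard amino acids with no bias, and computes the number of distinct peptide sequences together with an upper bound on the fraction of them that a library of a given size could contain. The function names are hypothetical.

```python
# Illustrative arithmetic only; assumes 20 equally likely amino acids per codon.

def theoretical_diversity(n_codons: int, alphabet: int = 20) -> int:
    """Number of distinct amino acid sequences over n randomised codons."""
    return alphabet ** n_codons


def coverage_upper_bound(library_size: float, n_codons: int) -> float:
    """Upper bound on the fraction of variants present in a library.

    This ignores duplicates, stop codons and codon bias, so the real
    coverage of a library of this size will be lower.
    """
    return min(1.0, library_size / theoretical_diversity(n_codons))
```

For example, three randomised codons give 8000 possible sequences, whereas six give 64 million, so the number of randomised codons strongly determines how large a library must be to sample the available diversity.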

Where separate PCR products are generated they are assembled using flanking primers 
which amplify the whole region of interest; the flanking primers used may be the same 
as or different from the diversification primers used, but preferably comprise no diversified 
sequence. 
In a further aspect, the invention provides a method for preparing a library of nucleic 
acids encoding a diversified immunoglobulin, comprising the steps of: 

(a) amplifying a nucleic acid molecule encoding the immunoglobulin using a first 
set of at least four primers, of which two primers are overlapping, and wherein one of the 
overlapping primers consists of a plurality of different primers, each of which comprises a 
mutagenic region which generates a mutation in a part of the nucleic acid sequence; 

(b) purifying the amplification products thus obtained; 

(c) assembling the amplification products in a further amplification reaction using 
a second set of primers which encompass the entire nucleic acid molecule which encodes 
the immunoglobulin; and 

(d) inserting the assembled amplification product into an expression vector. 
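
The assembly in step (c) relies on the two amplification products sharing an overlapping region. A minimal in silico sketch of that join is given below, assuming both products are written as + strand sequences and that the overlap is an exact match; a real assembly reaction depends on annealing and polymerase extension, which this string model does not simulate. The function name and the default minimum overlap of 15 nucleotides are hypothetical.

```python
# Illustrative sketch only: a string model of joining two overlapping products.

def assemble(left: str, right: str, min_overlap: int = 15) -> str:
    """Join two fragments that share an exact overlap.

    Finds the longest suffix of `left` (at least `min_overlap` long) that is
    also a prefix of `right`, and returns the merged sequence.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    raise ValueError("fragments do not share an overlap of the required length")
```

For instance, assemble("AAACCCGGG", "CCGGGTTT", min_overlap=4) merges the fragments on the shared "CCGGG" and returns "AAACCCGGGTTT". Keeping the overlap free of diversified sequence, as preferred above, is what allows an exact match to be expected.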

The method of this aspect of the invention provides a means to generate diversified 
immunoglobulin libraries without recourse to phage display or other expression display 
methodologies. 

The invention is particularly useful for the production of antibodies and antibody 
libraries, in particular libraries of antibody variable domains. These may be configured as 
scFv, Fabs or other fragments. Particularly preferred are single domain antibodies or 
DAbs. 

Advantageously, the invention is useful for the generation of intracellular antibodies, such 
as intracellular scFv or intracellular DAbs. 

The invention accordingly provides a library of nucleic acids prepared according to the 
method of the invention. Advantageously, it is a library of intracellular antibodies. 

Libraries of intracellular antibodies are advantageously generated by diversifying one, 
two or three CDRs on the intracellular antibody consensus framework described in our 
copending international patent application PCT/GB02/003512. This framework provides 
a basis on which intracellular antibodies may be based and reliably function in an 
intracellular environment; the use of the methods of the present invention allows 
diversification of the CDRs on the basis of this framework and thus the generation of 
libraries of intracellular immunoglobulins having low redundancy and reduced incidence 
of intracellular incompatibility. 

Brief Description of the Figures 

Figure 1. De novo antibody gene synthesis 

A. Flow diagram of de novo antibody gene synthesis. Step 1. Oligonucleotides 
corresponding to both strands of the desired antibody fragment (in this case an scFv but 
could be VH or VL alone) are mixed, annealed and ligated. Step 2. PCR amplification of 
whole scFv is achieved using flanking primers (conseSFI + conseNOT) carrying SfiI or 
NotI sites. Step 3. The PCR product is cleaved with SfiI and NotI and cloned into a 
compatible vector, in this case pEF-VP16. This vector was constructed from 
pEF/myc/cyto (Invitrogen) by addition of the VP16 transcriptional activation domain and 
mutation of the SfiI site for compatibility with most scFv cloned sequences (see 
methods). 

B. Sequence of hybrid scFv and location of oligonucleotides used for gene synthesis. 
scFvconR4 was designed using the intracellular antibody capture consensus 
framework sequence (Tse, E., Lobato, M. N., Forster, A., Tanaka, T., Chung, T. Y. G. 
and Rabbitts, T. H. (2002) J. Mol. Biol., 317, 85-94) with the VH and VL CDR 
sequences from anti-βgal scFv R4 (Martineau, P., Jones, P. and Winter, G. (1998) J. Mol. 
Biol., 280, 117-127). CDR regions are in yellow. 

C. Mammalian cell reporter assay for scFv intrabody activity. CHO-CD4 cells (Fearon, 
E. R., Finkel, T., Gillison, M. L., Kennedy, S. P., Casella, J. F., Tomaselli, G. F., Morrow, 
J. S. and Dang, C. V. (1992) Proc. Natl Acad. Sci. USA, 89, 7958-7962) were transfected 
as indicated and stimulation of CD4 surface expression was measured using a FACS with 
anti-human CD4 antibody and FITC-anti-mouse antibody. Cells were transfected with 
combinations of plasmids encoding DBD-βgal (lacZ), scFvR4-VP16 (scFvR4), 
DBD-RASG12V (RASG12V) and/or scFvconR4-VP16 (scFvconR4) as indicated. 

Figure 2. Footprint mutagenesis of scFv frameworks 

Specific framework residues of the intrabody scFv33 (see our copending UK patent 
application "Anti-activated RAS antibodies" filed on even date herewith) were mutated to 
those of scFvI21 to yield scFvI21R33 (16) by stepwise footprint mutagenesis. 
A. At each step of mutagenesis, the scFv template, cloned in pEF-VP16, was PCR 
amplified using a fixed primer (EFFP2 or VP162R) together with a mutant primer 
(indicated as - A «) at a specific position; this yields two PCR fragments which are 
assembled with EFFP + VP162R primers and cloned into pEF-VP16 for the next round of 
mutagenesis. 

B. Nucleotide and derived protein sequences of scFv33, indicating the amino acid 
residues mutated in the stepwise footprint mutagenesis to scFvI21R33. The PCR primers 
are shown above (red; forward primers) and below (blue; reverse primers) the template 
sequence. The CDR regions are highlighted in yellow and linker between VH and VL in 
grey. The substituted amino acid residues are shown in italics. 

Figure 3. Preparation of scFv intrabodies with randomised CDR3 using footprint 
mutagenesis 

The intracellular antibody scFvA25, recognising the BCR-ABL oncogenic protein (Tse, 
E., Lobato, M. N., Forster, A., Tanaka, T., Chung, T. Y. G. and Rabbitts, T. H. (2002) J. 
Mol. Biol., 317, 85-94), was used as a template for a diversified library with randomised 
mutations of the VH CDR3 region. 

A. The scFvA25 was cloned into the pEF-VP16 vector and two footprint mutagenesis 
PCR reactions carried out with primers EFFP2 + A25C3B (in which the central region of 
the primer has n=3, 6 or 10 to randomise amino acid residues in CDR3 of the VH 
domain). The two PCR products were mixed, assembled and cloned into the yeast vector 
pVP16*. CDRs are highlighted in yellow. 

B. The sequence of A25 VH CDR3 region (highlighted in yellow) and PCR primers 
A25CD3F + A25C3B N . 

C. The DNA sequences of randomly selected clones from each 1st PCR were obtained 
and the derived VH CDR3 protein sequences of these clones are shown (highlighted in 
yellow). 

Figure 4. Diversification of VH CDR1, 2 and 3 using footprint mutagenesis 
A. Footprint mutagenesis to generate a V-region segment with randomised CDR 
regions. The template illustrated is a VH segment and the CDR2 and CDR3 regions are 
mutated as shown. Step 1. Two PCR reactions were carried out with EFFP2 + CDR2R 
(which randomises CDR2 as shown in B) and CDR2F + CDR3R (which randomises 
CDR3 as shown in B). Step 2. The two reaction products were assembled into a 
complete VH sequence using EFFP2 + JH5R flanking primers and this in turn amplified 
with partially nested primers EFFP + NOTVHJR1 to incorporate SfiI and NotI restriction 
sites. Step 3. Cloning PCR products into yeast pVP16* (Visintin, M., Tse, E., Axelson, 
H., Rabbitts, T. H. and Cattaneo, A. (1999) Proc. Natl Acad. Sci. USA, 96, 
11723-11728). CDRs are highlighted in yellow. Forward primers are shown in red and reverse 
primers in blue. 

B. This illustrates a VH domain depicting framework regions (FR) and complementarity 
determining regions (CDR; highlighted in yellow) with the PCR oligonucleotide 
sequences used for mutagenesis. For first round mutagenesis, CDR2 and CDR3 were 
simultaneously changed as indicated. For second round mutagenesis, CDR1 was changed 
as indicated. 

C. Derived protein sequences of the CDR regions of five selected clones made by 
mutation of CDR2 + CDR3 compared with the CDR1/2/3 sequences (top line) of the 
canonical consensus VH sequence (CDR2/3) or five selected clones made by 
randomisation of CDR1 from a library of 3.4 × 10^6 clones with mutated CDR2 + CDR3 
(CDR1/2/3); CDR residues are highlighted in grey. The regions randomised in the PCR 
oligonucleotides are shown in the second line, with mutant residues highlighted in grey. 

Figure 5. Alignment of derived protein sequences of intracellular scFv 
The nucleotide sequences of the scFv were obtained and the derived protein translations 
(shown in the single letter code) were aligned. The complementarity determining regions 
(CDR) are shaded. Framework residues for SEQ no 1 to 40 are those which are 
underlined. The consensus residue at a specific position was taken as the most 
frequently occurring residue, but was only assigned if that residue occurred more than 5 times 
at that position. 

A. Sequences of VH and VL from anti-BCR (designated as B3-B89) and anti-ABL 
(designated as A5-A32). The combined consensus (Con) of the anti-BCR and ABL 
ICAbs is indicated and compared with the subgroup consensuses for VH3 and VκI from 
the Kabat database. 

- represents sequence identity with the intracellular antibody binding VH or VL consensus 
(SEQ. ID. No. 3 and SEQ. ID. No. 4 respectively) 
. represents gaps introduced to optimise alignment 
B. A sequence comparison of randomly selected scFv obtained from the unselected 
phage display library. The consensuses obtained from the randomly isolated scFv 
(rcH and rcL) are indicated. 

- represents gaps introduced to optimise alignment 

X represents positions at which no consensus could be assigned. 

Detailed Description of the Invention 

Unless defined otherwise, all technical and scientific terms used herein have the same 
meaning as commonly understood by one of ordinary skill in the art (e.g., in cell culture, 
molecular genetics, nucleic acid chemistry, hybridisation techniques and biochemistry). 
Standard techniques are used for molecular, genetic and biochemical methods. See, 
generally, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2d ed. (1989) Cold 
Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. and Ausubel et al., Short 
Protocols in Molecular Biology (1999) 4th Ed, John Wiley & Sons, Inc.; as well as 
Guthrie et al., Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, 
Vol. 194, Academic Press, Inc., (1991), PCR Protocols: A Guide to Methods and 
Applications (Innis, et al. 1990. Academic Press, San Diego, Calif.), McPherson et al., 
PCR Volume 1, Oxford University Press, (1991), Culture of Animal Cells: A Manual of 
Basic Technique, 2nd Ed. (R. I. Freshney. 1987. Liss, Inc. New York, N.Y.), and Gene 
Transfer and Expression Protocols, pp. 109-128, ed. E. J. Murray, The Humana Press 
Inc., Clifton, N. J. These documents are incorporated herein by reference. 

Definitions 

A nucleic acid, as referred to herein, is a sequence of nucleotides which is 
advantageously a DNA sequence. The nucleotides may be natural or synthetic, or a 
mixture of the two; the nucleic acids may be linear or circular in form as required. 

Immunoglobulins, according to the present invention, include members of the 
immunoglobulin superfamily, a family of polypeptides which comprise the 
immunoglobulin fold characteristic of antibody molecules, which contains two β sheets 
and, usually, a conserved disulphide bond. Members of the immunoglobulin superfamily 
are involved in many aspects of cellular and non-cellular interactions in vivo, including 
widespread roles in the immune system (for example, antibodies, T-cell receptor 
molecules and the like), involvement in cell adhesion (for example the ICAM molecules) 
and intracellular signalling (for example, receptor molecules, such as the PDGF receptor). 
The present invention preferably relates to single domain immunoglobulins derived 
from all immunoglobulin superfamily molecules which are capable of binding to target 
molecules. Preferably, the present invention relates to antibody single domains, in 
particular heavy chain variable (VH) domains. Single domain immunoglobulins are free 
of complementary domains, that is, they are not associated with other binding domains which, 
in nature or otherwise, may associate with the single domain to form a single composite 
binding site for a target. Specifically, VH domains are not in the presence of 
complementary VL domains in the single domains of the invention. However, further 
domains, such as antibody constant region domains, may be but need not be present. 

Antibodies, as used herein, refers to complete antibodies or antibody fragments capable 
of binding to a selected target, and including Fv, scFv, Fab' and F(ab')2, monoclonal and 
polyclonal antibodies, engineered antibodies including chimeric, CDR-grafted and 
humanised antibodies, and artificially selected antibodies produced using phage display 
or alternative techniques. Antibodies may be, or be based on, any naturally-occurring 
antibody type, including IgG, IgE, IgA, IgD and IgM. Single domains, such as VH 
domains, may be derived from any such antibody. 

The framework region of an immunoglobulin heavy and/or light chain variable domain 
has a particular three-dimensional conformation characterised by the presence of an 
immunoglobulin fold. Certain amino acid residues present in the variable domain are 
responsible for maintaining this characteristic immunoglobulin domain core structure. 
These residues are known as framework residues and tend to be highly conserved. The 
framework supports the CDRs of an antibody. 

CDR (complementarity determining region) of an immunoglobulin molecule heavy 
and/or light chain variable domain describes those amino acid residues which are not 
framework region residues and which are contained within the hypervariable loops of the 
variable regions. These hypervariable loops are directly involved with the interaction of 
the immunoglobulin with the ligand. Residues within these loops tend to show a lower degree 
of conservation than those in the framework region. 

Intracellular means inside a cell, and the present invention is directed to those 
immunoglobulins which will bind to ligands/targets selectively within a cell. The cell 
may be any cell, prokaryotic or eukaryotic, and is preferably selected from the group 
consisting of a bacterial cell, a yeast cell and a higher eukaryote cell. Most preferred are 
yeast cells and mammalian cells. As used herein, therefore, "intracellular" 
immunoglobulins and targets or ligands are immunoglobulins and targets/ligands which 
are present within a cell. In addition, the term "intracellular" refers to environments which 
resemble or mimic an intracellular environment. Thus, "intracellular" may refer to an 
environment which is not within the cell, but is in vitro. For example, the method of the 
invention may be performed in an in vitro transcription and/or translation system, which 
may be obtained commercially, or derived from natural systems. 

Consensus frameworks in the context of the present invention refers to the consensus 
sequences of those VH and VL chains from immunoglobulin molecules which can bind 
selectively to a ligand in an intracellular environment. The residue which is most common 
in any one given position, when the sequences of those immunoglobulins which can bind 
intracellularly are compared, is chosen as the consensus residue for that position. The 
consensus sequence is generated by comparing the residues for all the intracellularly 
binding immunoglobulins, at each position in turn, and then collating the data. 
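
The position-by-position comparison described above can be sketched as follows. This is a minimal illustration, assuming the sequences have already been aligned to equal length (with any gap characters treated as ordinary symbols). The function name, the optional min_count threshold and the use of "X" for unassigned positions are choices made for this example; where two residues are equally common, the one seen first is taken, which is likewise an assumption.

```python
# Illustrative sketch only: consensus by majority residue at each aligned position.
from collections import Counter


def consensus(aligned: list[str], min_count: int = 1, unknown: str = "X") -> str:
    """Return the most common residue at each position of an alignment.

    A position is reported as `unknown` when its most common residue occurs
    fewer than `min_count` times.
    """
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("sequences must be non-empty in number and of equal length")
    result = []
    for column in zip(*aligned):
        residue, count = Counter(column).most_common(1)[0]
        result.append(residue if count >= min_count else unknown)
    return "".join(result)
```

For example, consensus(["EVQL", "EVKL", "QVQL"]) returns "EVQL", while the same call with min_count=3 returns "XVXL" because only the second and fourth positions are unanimous.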

Oligonucleotides are nucleic acids composed of a plurality of nucleotides. 
Advantageously, they are useful in assembly of larger nucleic acids as described herein. 
No limit in length is intended to be implied, and oligonucleotides may range from short 
(3 to 10 nucleotides long) to long (1000 nucleotides long and more). Preferably, oligonucleotides 
are synthesised. 

Double stranded nucleic acids possess + and - strands, which equate to a coding strand 
and a non-coding strand in coding sequences. The + and - strands are complementary, 
and anneal to form a nucleic acid duplex. 
Annealing is the process by which complementary nucleic acids hybridise to each other 
to form a double stranded nucleic acid. Nucleic acids do not need to be 100% 
complementary to anneal to each other. 

Effector domains are polypeptides which are capable of exerting a chemical or 
biological function. For example, antibody Fc regions comprise effector domains, which 
can be attached to scFv or DAb antibodies. Labels are advantageously protein labels, such 
as luminescent labels (for example GFP) or antigenic epitopes. 

Nucleic acids according to the invention are amplified using any available amplification 
technique. "Amplification" refers to the increase in the number of copies of a particular 
nucleic acid fragment (or a portion of this) resulting from an enzymatic chain reaction 
(such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained sequence 
replication). Preferably, the amplification according to our invention is an exponential 
amplification, as exhibited by for example the polymerase chain reaction (PCR). 

Primers are nucleic acid molecules which are used to prime amplification reactions. 
They are complementary to a part of the nucleic acids which it is desired to amplify, and 
allow the amplification process to begin at the point at which they anneal. Primers are 
mutagenic if they insert, into a nucleic acid to be amplified, a mutation due to a base pair 
mismatch which is tolerated in the primer annealing to the nucleic acid but carried over to 
the amplified product. 
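
As an illustration of this definition, a mutagenic primer can be modelled as a copy of the template in which one codon is replaced while the flanking bases still match. The sketch below is a simplification: it assumes a forward (+ strand sense) primer and a single substituted codon, and says nothing about melting temperature or other design criteria. The function names and the default arm length of 12 nucleotides are hypothetical.

```python
# Illustrative sketch only: a forward primer carrying one substituted codon.

def mutagenic_primer(template: str, codon_start: int, new_codon: str,
                     arm: int = 12) -> str:
    """Build a primer that replaces the codon at `codon_start` in `template`.

    Up to `arm` matching nucleotides are kept on each side of the new codon
    so that the primer can still anneal despite the mismatch.
    """
    if len(new_codon) != 3:
        raise ValueError("new_codon must be three nucleotides long")
    if codon_start < 0 or codon_start + 3 > len(template):
        raise ValueError("codon_start lies outside the template")
    left = template[max(0, codon_start - arm):codon_start]
    right = template[codon_start + 3:codon_start + 3 + arm]
    return left + new_codon + right


def mismatches(primer: str, target: str) -> int:
    """Count positions at which the primer differs from its annealing site."""
    return sum(a != b for a, b in zip(primer, target))
```

With the template "AAAAAACCCGGGTTTTTT", replacing the codon at index 6 with "GCT" and arm=3 gives the primer "AAAGCTGGG", which differs from its annealing site "AAACCCGGG" at two positions.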

Amplification products are assembled, in the context of the present invention, by ligation 
or otherwise to produce a composite nucleic acid comprising more than one amplification 
product. In a preferred embodiment, the amplification products are assembled by PCR or 
another reaction involving nucleic acid replication which employs primers located at 
either end of the desired assembled molecule, thus directing replication and/or 
amplification of the full length assembled nucleic acid. 

Mutagenic primers may comprise randomised and/or semi-randomised sequence. 
Randomised sequence occurs at a position in the primer where any nucleotide, A, C, G or 
T, may appear. This is usually denoted by the letter N in the sequence. Thus, a codon 
represented as 'NNN' may encode any amino acid, or be a nonsense or stop codon. 
Semi-randomised sequences are restricted in the degree of randomisation, such that not 
every nucleotide may appear at every position. For instance, the nature of the third base in 
any codon may be restricted, to avoid the incidence of stop codons. The randomised 
sequence extends over a number of codons, which is the codon equivalence of the 
randomisation. A multiple codon equivalence indicates that two or more codons are 
(semi-) randomised. Preferably, between 3 and 12 codons are (semi-) randomised; 
advantageously, 6 to 10 codons are (semi-) randomised. 
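
The effect of restricting the third base can be checked by enumeration. The sketch below compares a fully randomised 'NNN' codon with 'NNK' (K = G or T), a commonly used semi-randomised scheme that is offered here only as an example and is not named in this specification. It uses the standard genetic code; the function and variable names are hypothetical.

```python
# Illustrative sketch only: count codons, amino acids and stops for a degenerate codon.
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Subset of the IUPAC nucleotide ambiguity codes needed for this example.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "K": "GT"}


def expand(pattern: str) -> list[str]:
    """List every concrete codon matched by a degenerate three-letter pattern."""
    return ["".join(bases) for bases in product(*(IUPAC[ch] for ch in pattern))]


def summarise(pattern: str) -> tuple[int, int, int]:
    """Return (number of codons, amino acids encoded, stop codons)."""
    codons = expand(pattern)
    encoded = {CODON_TABLE[c] for c in codons} - {"*"}
    stops = sum(CODON_TABLE[c] == "*" for c in codons)
    return len(codons), len(encoded), stops
```

Under these assumptions, summarise("NNN") gives (64, 20, 3) and summarise("NNK") gives (32, 20, 1): all twenty amino acids remain available while two of the three stop codons are excluded.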

Mutagenesis using randomised sequences as described leads to diversification of the 
sequence of the nucleic acids and/or primers according to the invention, and thus the 
generation of libraries of diversified polypeptides by expression of the nucleic acids. 
Expression is the transcription and/or translation of nucleic acid into the gene product it 
encodes; in the context of the present invention, that gene product is a polypeptide. Thus, 
expression is the transcription and/or translation of nucleic acids to form polypeptides. 
The polypeptides are advantageously immunoglobulins. 

An expression vector is a nucleic acid which comprises a coding sequence and the 
sequences necessary for that coding sequence to be expressed, as defined herein. 
Typically, an expression vector will be a plasmid, and will comprise one or more origins 
of replication, a promoter and optionally enhancer sequences to direct transcription of the 
coding sequences, and optionally one or more markers which allow the vector or cells 
containing the vector to be identified and/or selected for. 

Isolation, as referred to herein, is the purification of the desired substance from one or 
more undesired components with which it is associated. Thus, isolation of nucleic acids 
according to the invention indicates that the nucleic acids are purified from unreacted 
nucleotides, primers, enzymes and other reaction components with which they are 
associated after a given reaction. The degree of purification need not be complete 
purification; it suffices to isolate the desired nucleic acids sufficiently to allow them to be 
used in further processes and/or reactions. 

The methods of the invention involve the templated replication and/or amplification of 
desired nucleic acids. "Amplification" refers to the increase in the number of copies of a 
particular nucleic acid fragment (or a portion of this) resulting from an enzymatic chain 
reaction (such as a polymerase chain reaction, a ligase chain reaction, or a self-sustained 
sequence replication). Preferably, the amplification is an exponential amplification, as 
exhibited by, for example, the polymerase chain reaction. 

Many target and signal amplification methods have been described in the literature. See, 
for example, general reviews of these methods in Landegren, U., et al., Science 242:229- 
237 (1988) and Lewis, R., Genetic Engineering News 10:1, 54-55 (1990). These 
amplification methods may be used in the methods of the invention, and include 
polymerase chain reaction (PCR), PCR in situ, ligase amplification reaction (LAR), ligase 
hybridisation, Qbeta bacteriophage replicase, transcription-based amplification system 
(TAS), genomic amplification with transcript sequencing (GAWTS), nucleic acid 
sequence-based amplification (NASBA) and in situ hybridisation. The use of PCR is 
preferred. 

Polymerase Chain Reaction (PCR) 

PCR is a nucleic acid amplification method described inter alia in U.S. Pat. Nos. 
4,683,195 and 4,683,202. PCR consists of repeated cycles of DNA polymerase generated 
primer extension reactions. The target DNA is heat denatured and two oligonucleotides, 
which bracket the target sequence on opposite strands of the DNA to be amplified, are 
hybridised. These oligonucleotides become primers for use with DNA polymerase. The 
DNA is copied by primer extension to make a second copy of both strands. By repeating 
the cycle of heat denaturation, primer hybridisation and extension, the target DNA can be 
amplified a million fold or more in about two to four hours. An advantage of PCR is that 
it increases sensitivity by amplifying the amount of target DNA by 1 million to 1 billion 
fold in approximately 4 hours. In the context of the present invention, PCR is used to 
amplify desired gene products, to assemble amplification products of other PCR reactions 
into full-length nucleic acids, and to introduce mutations into nucleic acids using primers 
which comprise randomised sequences. 
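
The amplification figures quoted above follow from the doubling that occurs in each cycle. As a rough sketch, assuming idealised conditions in which every cycle multiplies the number of copies by (1 + efficiency), with the efficiency parameter being a hypothetical quantity introduced here for illustration only:

```python
def fold_amplification(cycles, efficiency=1.0):
    """Fold increase after `cycles`; each cycle multiplies copies by (1 + efficiency)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    return (1.0 + efficiency) ** cycles


def cycles_needed(target_fold, efficiency=1.0):
    """Smallest whole number of cycles giving at least `target_fold` amplification."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in the interval (0, 1]")
    cycles, fold = 0, 1.0
    while fold < target_fold:
        fold *= 1.0 + efficiency
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(cycles_needed(1e6))  # cycles for a million-fold increase with perfect doubling
    print(cycles_needed(1e9))  # cycles for a billion-fold increase with perfect doubling
    print(cycles_needed(1e6, efficiency=0.8))  # more cycles if doubling is imperfect
```

With perfect doubling, about 20 cycles give a million-fold increase and about 30 cycles give a billion-fold increase; real reactions fall short of perfect doubling, so these are lower bounds on the number of cycles.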

Reverse transcriptase-PCR 

RT-PCR is used to amplify RNA targets. In this process, the reverse transcriptase enzyme 
is used to convert RNA to complementary DNA (cDNA), which can then be amplified 
using PCR. This method has proven useful for the detection of RNA viruses. 

The methods of the invention may employ RT-PCR. Thus, the nucleic acid encoding the 
immunoglobulin may be provided in the form of RNA. This RNA could be generated in 
vivo in bacteria, mammalian cells, yeast etc, and may for example be the transcription 
product of endogenous immunoglobulin genes. 

Ligation Amplification (LAR/LAS) 

Ligation amplification reaction or ligation amplification system uses DNA ligase and four 
oligonucleotides, two per target strand. This technique is described by Wu, D. Y. and 
Wallace, R. B. (1989) Genomics 4:560. The oligonucleotides hybridise to adjacent 
sequences on the target DNA and are joined by the ligase. The reaction is heat denatured 
and the cycle repeated. The opposite strand may be copied by any replicase enzyme. 

Qβ Replicase 

In this technique, RNA replicase for the bacteriophage Qβ, which replicates single-stranded 
RNA, is used to amplify the target DNA, as described by Lizardi et al. (1988) 
Bio/Technology 6:1197. First, the target DNA is hybridised to a primer including a T7 
promoter and a Qβ 5' sequence region. Using this primer, reverse transcriptase generates 
a cDNA connecting the primer to its 5' end in the process. These two steps are similar to 
the TAS protocol. The resulting heteroduplex is heat denatured. Next, a second primer 
containing a Qβ 3' sequence region is used to initiate a second round of cDNA synthesis. 
This results in a double stranded DNA containing both 5' and 3' ends of the Qβ 
bacteriophage as well as an active T7 RNA polymerase binding site. T7 RNA polymerase 
then transcribes the double-stranded DNA into new RNA, which mimics the Qβ. After 
extensive washing to remove any unhybridized probe, the new RNA is eluted from the 
target and replicated by Qβ replicase. The latter reaction creates 10⁷-fold amplification in 
approximately 20 minutes. Significant background may be formed due to minute amounts 
of probe RNA that is non-specifically retained during the reaction. 

Other Amplification Techniques 

Alternative amplification technology may be exploited in the present invention. For 
example, rolling circle amplification (Lizardi et al. (1998) Nat Genet 19:225) is an 
amplification technology available commercially (RCAT™) which is driven by DNA 
polymerase and can replicate circular oligonucleotide probes with either linear or 
geometric kinetics under isothermal conditions. 

In the presence of two suitably designed primers, a geometric amplification occurs via 
DNA strand displacement and hyperbranching to generate 10¹² or more copies of each 
circle in 1 hour. 

A further technique, strand displacement amplification (SDA; Walker et al. (1992) 
PNAS (USA) 89:392) begins with a specifically defined sequence unique to a specific 
target. But unlike other techniques which rely on thermal cycling, SDA is an isothermal 
process that utilises a series of primers, DNA polymerase and a restriction enzyme to 
exponentially amplify the unique nucleic acid sequence. 

SDA comprises both a target generation phase and an exponential amplification phase. 

In target generation, double-stranded DNA is heat denatured creating two single-stranded 
copies. A series of specially manufactured primers combine with DNA polymerase 
(amplification primers for copying the base sequence and bumper primers for displacing 
the newly created strands) to form altered targets capable of exponential amplification. 
The exponential amplification process begins with altered targets (single-stranded partial 
DNA strands with restriction enzyme recognition sites) from the target generation phase. 
An amplification primer is bound to each strand at its complementary DNA sequence. 
DNA polymerase then uses the primer to identify a location to extend the primer from its 
3' end, using the altered target as a template for adding individual nucleotides. The 
extended primer thus forms a double-stranded DNA segment containing a complete 
restriction enzyme recognition site at each end. 



A restriction enzyme is then bound to the double-stranded DNA segment at its recognition 
site. The restriction enzyme dissociates from the recognition site after having cleaved 
only one strand of the double-stranded segment, forming a nick. DNA polymerase recognises 
the nick and extends the strand from the site, displacing the previously created strand. 
The recognition site is thus repeatedly nicked and restored by the restriction enzyme and 
DNA polymerase with continuous displacement of DNA strands containing the target 
segment. 

Each displaced strand is then available to anneal with amplification primers as above. The 
process continues with repeated nicking, extension and displacement of new DNA 
strands, resulting in exponential amplification of the original DNA target. 


EXPRESSION SYSTEMS 

Methods which are well known to those skilled in the art are used to construct expression 
vectors containing sequences encoding immunoglobulins according to the invention and 
appropriate transcriptional and translational control elements. These methods include in 
vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic 
recombination. Such techniques are described in Sambrook, J. et al. (1989; Molecular 
Cloning, A Laboratory Manual, ch. 4, 8, and 16-17, Cold Spring Harbor Press, Plainview, 
N.Y.) and Ausubel, F. M. et al. (1995 and periodic supplements; Current Protocols in 
Molecular Biology, ch. 9, 13, and 16, John Wiley & Sons, New York, N.Y.). 

A variety of expression vector/host systems may be utilised to contain and express 
sequences encoding immunoglobulins. These include, but are not limited to, 
microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, 
or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; 
insect cell systems infected with virus expression vectors (e.g., baculovirus); plant cell 
systems transformed with virus expression vectors (e.g., cauliflower mosaic virus 
(CaMV) or tobacco mosaic virus (TMV)) or with bacterial expression vectors (e.g., Ti or 
pBR322 plasmids); or animal cell systems. The invention is not limited by the host cell 
employed. 

The "control elements" or "regulatory sequences" are those non-translated regions of the 
vector (i.e., enhancers, promoters, and 5' and 3' untranslated regions) which interact with 



WO 2004/046189 PCT/GB2003/004964 


host cellular proteins to carry out transcription and translation. Such elements may vary in 
their strength and specificity. Depending on the vector system and host utilised, any 
number of suitable transcription and translation elements, including constitutive and 
inducible promoters, may be used. For example, when cloning in bacterial systems, 
inducible promoters such as the hybrid lacZ promoter of the BLUESCRIPT phagemid 
(Stratagene, La Jolla, Calif.) or PSPORT1 plasmid (GIBCO/BRL), and the like, may be 
used. The baculovirus polyhedrin promoter may be used in insect cells. Promoters or 
enhancers derived from the genomes of plant cells (e.g., heat shock, RUBISCO, and 
storage protein genes) or from plant viruses (e.g., viral promoters or leader sequences) 
may be cloned into the vector. In mammalian cell systems, promoters from mammalian 
genes or from mammalian viruses are preferable. If it is necessary to generate a cell line 
that contains multiple copies of the sequence encoding an immunoglobulin, vectors based 
on SV40 or EBV may be used with an appropriate selectable marker. 

In bacterial systems, a number of expression vectors may be selected. For example, 
vectors which direct high level expression of fusion proteins may be used. Such vectors 
include, but are not limited to, multifunctional E. coli cloning and expression vectors such 
as BLUESCRIPT (Stratagene), in which the sequence encoding the immunoglobulin may 
be ligated into the vector in frame with sequences for the amino-terminal Met and the 
subsequent 7 residues of β-galactosidase so that a hybrid protein is produced, pIN vectors 
(Van Heeke, G. and S. M. Schuster (1989) J. Biol. Chem. 264:5503-5509), and the like. 
pGEX vectors (Promega, Madison, Wis.) may also be used to express foreign 
polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such 
fusion proteins are soluble and can easily be purified from lysed cells by adsorption to 
glutathione-agarose beads followed by elution in the presence of free glutathione. 
Proteins made in such systems may be designed to include heparin, thrombin, or factor 
Xa protease cleavage sites so that the cloned polypeptide of interest can be released from 
the GST moiety at will. 

In the yeast Saccharomyces cerevisiae, a number of vectors containing constitutive or 
inducible promoters, such as alpha factor, alcohol oxidase, and PGH, may be used. For 
reviews, see Ausubel (supra) and Grant et al. (1987; Methods Enzymol. 153:516-544). 

In cases where plant expression vectors are used, the expression of sequences encoding 
the immunoglobulin may be driven by any of a number of promoters. For example, viral 
promoters such as the 35S and 19S promoters of CaMV may be used alone or in 
combination with the omega leader sequence from TMV. (Takamatsu, N. (1987) EMBO 
J. 6:307-311.) Alternatively, plant promoters such as the small subunit of RUBISCO or 
heat shock promoters may be used. (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; 
Broglie, R. et al. (1984) Science 224:838-843; and Winter, J. et al. (1991) Results Probl. 
Cell Differ. 17:85-105.) These constructs can be introduced into plant cells by direct 
DNA transformation or pathogen-mediated transfection. Such techniques are described in 
a number of generally available reviews. (See, for example, Hobbs, S. or Murry, L. E. in 
McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New York, 
N.Y.; pp. 191-196.) 

An insect system may also be used to express the immunoglobulin. For example, in one 
such system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a 
vector to express foreign genes in Spodoptera frugiperda cells or in Trichoplusia larvae. 
The sequences encoding the immunoglobulin may be cloned into a non-essential region 
of the virus, such as the polyhedrin gene, and placed under control of the polyhedrin 
promoter. Successful insertion of the immunoglobulin gene will render the polyhedrin 
gene inactive and produce recombinant virus lacking coat protein. The recombinant 
viruses may then be used to infect, for example, S. frugiperda cells or Trichoplusia larvae 
in which the immunoglobulin may be expressed. (Engelhard, E. K. et al. (1994) Proc. Natl. 
Acad. Sci. 91:3224-3227.) 

In mammalian host cells, a number of viral-based expression systems may be utilised. In 
cases where an adenovirus is used as an expression vector, sequences encoding the 
immunoglobulin may be ligated into an adenovirus transcription/translation complex 
consisting of the late promoter and tripartite leader sequence. Insertion in a non-essential 
E1 or E3 region of the viral genome may be used to obtain a viable virus which is capable 
of expressing the immunoglobulin in infected host cells. (Logan, J. and T. Shenk (1984) 
Proc. Natl. Acad. Sci. 81:3655-3659.) In addition, transcription enhancers, such as the 
Rous sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian 
host cells. 

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments 
of DNA than can be contained and expressed in a plasmid. HACs of about 6 kb to 10 Mb 
are constructed and delivered via conventional delivery methods (liposomes, polycationic 
amino polymers, or vesicles) for therapeutic purposes. 

Specific initiation signals may also be used to achieve more efficient translation of 
sequences encoding the immunoglobulin. Such signals include the ATG initiation codon 
and adjacent sequences. In cases where sequences encoding the immunoglobulin and its 
initiation codon and upstream sequences are inserted into the appropriate expression 
vector, no additional transcriptional or translational control signals may be needed. 
However, in cases where only coding sequence, or a fragment thereof, is inserted, 
exogenous translational control signals including the ATG initiation codon should be 
provided. Furthermore, the initiation codon should be in the correct reading frame to 
ensure translation of the entire insert. Exogenous translational elements and initiation 
codons may be of various origins, both natural and synthetic. The efficiency of expression 
may be enhanced by the inclusion of enhancers appropriate for the particular cell system 
used, such as those described in the literature. (Scharf, D. et al. (1994) Results Probl. Cell 
Differ. 20:125-162.) 
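
The reading-frame requirement can be checked computationally before an insert is cloned. The following is a minimal sketch under stated assumptions, not a procedure taken from this specification: it assumes the insert is supplied as a DNA string beginning at the intended initiation codon, uses the three standard stop codons, and the function name `check_insert` and the example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_insert(seq):
    """Return a list of problems found in a coding insert (empty list means none found)."""
    seq = seq.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("no ATG initiation codon at the start")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons in the frame set by the first base.
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon is tolerated only as the final codon of the insert.
    for index, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"in-frame stop codon {codon} at codon {index + 1}")
            break
    return problems


if __name__ == "__main__":
    print(check_insert("ATGGCTGAATAA"))   # start codon present, frame intact
    print(check_insert("ATGTAAGCTTAA"))   # premature in-frame stop
    print(check_insert("GCTGCTTAA"))      # coding fragment lacking an ATG
```

An insert that fails the first check corresponds to the case described above in which exogenous translational control signals, including the ATG initiation codon, should be provided.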

In addition, a host cell strain may be chosen for its ability to modulate expression of the 
inserted sequences or to process the expressed protein in the desired fashion. Such 
modifications of the polypeptide include, but are not limited to, acetylation, 
carboxylation, glycosylation, phosphorylation, lipidation, and acylation. Post-translational 
processing which cleaves a "prepro" form of the protein may also be used to facilitate 
correct insertion, folding, and/or function. Different host cells which have specific 
cellular machinery and characteristic mechanisms for post-translational activities (e.g., 
CHO, HeLa, MDCK, HEK293, and WI38), are available from the American Type 
Culture Collection (ATCC, Bethesda, Md.) and may be chosen to ensure the correct 
modification and processing of the foreign protein. 

For long term, high yield production of recombinant proteins, stable expression is 
preferred. For example, cell lines capable of stably expressing the immunoglobulin can be 
transformed using expression vectors which may contain viral origins of replication 
and/or endogenous expression elements and a selectable marker gene on the same or on a 
separate vector. Following the introduction of the vector, cells may be allowed to grow 
for about 1 to 2 days in enriched media before being switched to selective media. The 
purpose of the selectable marker is to confer resistance to selection, and its presence 
allows growth and recovery of cells which successfully express the introduced sequences. 
Resistant clones of stably transformed cells may be proliferated using tissue culture 
techniques appropriate to the cell type. 

Any number of selection systems may be used to recover transformed cell lines. These 
include, but are not limited to, the herpes simplex virus thymidine kinase genes (Wigler, 
M. et al. (1977) Cell 11:223-32) and adenine phosphoribosyltransferase genes (Lowy, I. 
et al. (1980) Cell 22:817-23), which can be employed in tk⁻ or apr⁻ cells, respectively. 
Also, antimetabolite, antibiotic, or herbicide resistance can be used as the basis for 
selection. For example, dhfr confers resistance to methotrexate (Wigler, M. et al. (1980) 
Proc. Natl. Acad. Sci. 77:3567-70); npt confers resistance to the aminoglycosides 
neomycin and G-418 (Colbere-Garapin, F. et al. (1981) J. Mol. Biol. 150:1-14); and als or 
pat confer resistance to chlorsulfuron and phosphinotricin acetyltransferase, respectively 
(Murry, supra). Additional selectable genes have been described, for example, trpB, 
which allows cells to utilise indole in place of tryptophan, or hisD, which allows cells to 
utilise histidinol in place of histidine. (Hartman, S. C. and R. C. Mulligan (1988) Proc. Natl. 
Acad. Sci. 85:8047-51.) Recently, the use of visible markers has gained popularity with 
such markers as anthocyanins, β-glucuronidase and its substrate GUS, and luciferase and 
its substrate luciferin. These markers can be used not only to identify transformants, but 
also to quantify the amount of transient or stable protein expression attributable to a 
specific vector system. (Rhodes, C. A. et al. (1995) Methods Mol. Biol. 55:121-131.) 

Although the presence/absence of marker gene expression suggests that the gene of 
interest is also present, the presence and expression of the gene may need to be 
confirmed. For example, if the sequence encoding the immunoglobulin is inserted within 
a marker gene sequence, transformed cells containing sequences encoding the 
immunoglobulin can be identified by the absence of marker gene function. Alternatively, 
a marker gene can be placed in tandem with a sequence encoding the immunoglobulin 
under the control of a single promoter. Expression of the marker gene in response to 
induction or selection usually indicates expression of the tandem gene as well. 

Alternatively, host cells which express the immunoglobulin may be identified by a variety 
of procedures known to those of skill in the art. These procedures include, but are not 
limited to, DNA-DNA or DNA-RNA hybridizations and immunoassay techniques which 
include membrane, solution, or chip based technologies for the detection and/or 
quantification of nucleic acid or protein sequences. 

The presence of polynucleotide sequences encoding the immunoglobulin can be detected 
by DNA-DNA or DNA-RNA hybridisation or amplification using probes or fragments 
of polynucleotides encoding the immunoglobulin. Nucleic acid amplification 
based assays involve the use of oligonucleotides or oligomers based on the sequences 
encoding the immunoglobulin, as described above, to detect transformants containing 
DNA or RNA encoding the immunoglobulin. 

A variety of protocols for detecting and measuring the expression of immunoglobulins are 
known in the art. Examples of such techniques include enzyme-linked immunosorbent 
assays (ELISAs), radioimmunoassays (RIAs), and fluorescence activated cell sorting 
(FACS). These and other assays are well described in the art, for example, in Hampton, 
R. et al. (1990; Serological Methods, a Laboratory Manual, Section IV, APS Press, St 
Paul, Minn.) and in Maddox, D. E. et al. (1983; J. Exp. Med. 158:1211-1216). 

A wide variety of labels and conjugation techniques are known by those skilled in the art 
and may be used in various nucleic acid and amino acid assays. Means for producing 
labelled hybridisation or PCR probes for detecting sequences related to polynucleotides 
encoding immunoglobulins include oligolabeling, nick translation, end-labelling, or PCR 
amplification using a labelled nucleotide. Alternatively, the sequences encoding the 
immunoglobulin, or any fragments thereof, may be cloned into a vector for the production 
of an mRNA probe. Such vectors are known in the art, are commercially available, and 
may be used to synthesise RNA probes in vitro by addition of an appropriate RNA 
polymerase such as T7, T3, or SP6 and labelled nucleotides. These procedures may be 
conducted using a variety of commercially available kits, such as those provided by 
Pharmacia & Upjohn (Kalamazoo, Mich.), Promega (Madison, Wis.), and U.S. 
Biochemical Corp. (Cleveland, Ohio). Suitable reporter molecules or labels which may be 
used for ease of detection include radionuclides, enzymes, fluorescent, chemiluminescent, 
or chromogenic agents, as well as substrates, cofactors, inhibitors, magnetic particles, and 
the like. 

Host cells transformed with nucleotide sequences encoding the immunoglobulin may be 
cultured under conditions suitable for the expression and recovery of the protein from cell 
culture. The protein produced by a transformed cell may be located in the cell membrane, 
secreted or contained intracellularly depending on the sequence and/or the vector used. As 
will be understood by those of skill in the art, expression vectors containing 
polynucleotides which encode the immunoglobulin may be designed to contain signal 
sequences which direct secretion of the immunoglobulin through a prokaryotic or 
eukaryotic cell membrane. Other constructions may be used to join sequences encoding 
the immunoglobulin to nucleotide sequences encoding a polypeptide domain which will 
facilitate purification of soluble proteins. Such purification facilitating domains include, 
but are not limited to, metal chelating peptides such as histidine-tryptophan modules that 
allow purification on immobilised metals, protein A domains that allow purification on 
immobilised immunoglobulin, and the domain utilised in the FLAGS extension/affinity 
purification system (Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker 
sequences, such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego, 
Calif.), between the purification domain and the immunoglobulin-encoding sequence may 
be used to facilitate purification. One such expression vector provides for expression of a 
fusion protein containing the immunoglobulin and a nucleic acid encoding 6 histidine 
residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues 
facilitate purification on immobilised metal ion affinity chromatography (IMIAC; 
described in Porath, J. et al. (1992) Prot. Exp. Purif. 3: 263-281), while the enterokinase 
cleavage site provides a means for purifying the immunoglobulin from the fusion protein. A discussion 
of vectors which contain fusion proteins is provided in Kroll, D. J. et al. (1993; DNA Cell 
Biol. 12:441-453). 

Immunoglobulin molecules, according to the present invention, refer to members of the 
immunoglobulin superfamily, a family of polypeptides which comprise the 
immunoglobulin fold characteristic of antibody molecules, which contains two β sheets 
and, usually, a conserved disulphide bond. Members of the immunoglobulin superfamily 
are involved in many aspects of cellular and non-cellular interactions in vivo, including 
widespread roles in the immune system (for example, antibodies, T-cell receptor 
molecules and the like), involvement in cell adhesion (for example the ICAM molecules) 
and intracellular signalling (for example, receptor molecules, such as the PDGF receptor). 
The present invention is applicable to all immunoglobulin superfamily molecules which 
are capable of binding to target molecules. Preferably, the present invention relates to 
antibodies. 

Antibodies, as used herein, refers to complete antibodies or antibody fragments capable of 
binding to a selected target, and including Fv, scFv, Fab' and F(ab')₂, monoclonal and 
polyclonal antibodies, engineered antibodies including chimeric, CDR-grafted and 
humanised antibodies, and artificially selected antibodies produced using phage display 
or alternative techniques. Small fragments, such as Fv and scFv, possess advantageous 
properties for diagnostic and therapeutic applications on account of their small size and 
consequent superior tissue distribution. Preferably, the antibody is a single chain 
antibody or scFv. 

The antibodies according to the invention are especially indicated for diagnostic and 
therapeutic applications. Accordingly, they may be altered antibodies comprising an 
effector protein such as a toxin or a label. Especially preferred are labels which allow the 
imaging of the distribution of the antibody in vivo. Such labels may be radioactive labels 
or radioopaque labels, such as metal particles, which are readily visualisable within the 
body of a patient. Moreover, they may be fluorescent labels or other labels which are 
visualisable on tissue samples removed from patients. Effector groups may be added 
during the synthesis of the antibodies by the method of the present invention, or 
afterwards. 

Recombinant DNA technology may be used to produce the antibodies of the invention 
according to established procedure, in bacterial or preferably mammalian cell culture. 
The selected cell culture system preferably secretes the antibody product. 

Multiplication of mammalian host cells in vitro is carried out in suitable culture media, 
which are the customary standard culture media, for example Dulbecco's Modified Eagle 
Medium (DMEM) or RPMI 1640 medium, optionally replenished by a mammalian 
serum, e.g. foetal calf serum, or trace elements and growth sustaining supplements, e.g. 
feeder cells such as normal mouse peritoneal exudate cells, spleen cells, bone marrow 
macrophages, 2-aminoethanol, insulin, transferrin, low density lipoprotein, oleic acid, or 
the like. Multiplication of host cells which are bacterial cells or yeast cells is likewise 
carried out in suitable culture media known in the art, for example for bacteria in medium 
LB, NZCYM, NZYM, NZM, Terrific Broth, SOB, SOC, 2 x YT, or M9 Minimal 
Medium, and for yeast in medium YPD, YEPD, Minimal Medium, or Complete Minimal 
Dropout Medium. 

In vitro production provides relatively pure antibody preparations and allows scale-up to 
give large amounts of the desired antibodies. Techniques for bacterial cell, yeast or 
mammalian cell cultivation are known in the art and include homogeneous suspension 
culture, e.g. in an airlift reactor or in a continuous stirrer reactor, or immobilised or 
entrapped cell culture, e.g. in hollow fibres, microcapsules, on agarose microbeads or 
ceramic cartridges. 

The foregoing, and other, techniques are discussed in, for example, Harlow and Lane, 
Antibodies: a Laboratory Manual, (1988) Cold Spring Harbor, incorporated herein by 
reference. Techniques for the preparation of recombinant antibody molecules are 
described in the above reference and also in, for example, EP 0623679, EP 0368684 and 
EP 0436597, which are incorporated herein by reference. 

The cell culture supernatants are screened for the desired antibodies, preferentially by 
immunofluorescent staining of cells expressing the desired target, by immunoblotting, by 
an enzyme immunoassay, e.g. a sandwich assay or a dot-assay, or a radioimmunoassay. 

For isolation of the antibodies, the immunoglobulins in the culture supernatants or in the 
ascitic fluid may be concentrated, e.g. by precipitation with ammonium sulphate, dialysis 
against hygroscopic material such as polyethylene glycol, filtration through selective 
membranes, or the like. If necessary and/or desired, the antibodies are purified by the 
customary chromatography methods, for example gel filtration, ion-exchange 
chromatography, chromatography over DEAE-cellulose and/or (immuno-)affinity 
chromatography, e.g. affinity chromatography with the target molecule or with Protein-A. 

Immunoglobulin libraries constructed according to the invention may be used in any library 
selection procedure. Selection protocols for isolating desired members of libraries are 
known in the art, as typified by phage display techniques. Such systems, in which diverse 
peptide sequences are displayed on the surface of filamentous bacteriophage, have proven 
useful for creating libraries of antibody fragments (and the nucleotide sequences that 
encode them) for the in vitro selection and amplification of specific antibody fragments 
that bind a target antigen. The nucleotide sequences encoding the VH and VL regions are 
linked to gene fragments which encode leader signals that direct them to the periplasmic 
space of E. coli and as a result the resultant antibody fragments are displayed on the 
surface of the bacteriophage, typically as fusions to bacteriophage coat proteins (e.g., pIII 
or pVIII). Alternatively, antibody fragments are displayed externally on lambda phage 
capsids (phagebodies). An advantage of phage-based display systems is that, because they 
are biological systems, selected library members can be amplified simply by growing the 
phage containing the selected library member in bacterial cells. Furthermore, since the 
nucleotide sequence that encodes the polypeptide library member is contained on a phage 
or phagemid vector, sequencing, expression and subsequent genetic manipulation is 
relatively straightforward. 

Methods for the construction of bacteriophage antibody display libraries and lambda 
phage expression libraries are well known in the art (McCafferty et al. (1990) supra; 
Kang et al. (1991) Proc. Natl. Acad. Sci. U.S.A., 88: 4363; Clackson et al. (1991) Nature, 
352: 624; Lowman et al. (1991) Biochemistry, 30: 10832; Burton et al. (1991) Proc. Natl. 
Acad. Sci. U.S.A., 88: 10134; Hoogenboom et al. (1991) Nucleic Acids Res., 19: 4133; 
Chang et al. (1991) J. Immunol., 147: 3610; Breitling et al. (1991) Gene, 104: 147; Marks 
et al. (1991) supra; Barbas et al. (1992) supra; Hawkins and Winter (1992) J. Immunol., 
22: 867; Marks et al., 1992, J. Biol. Chem., 267: 16007; Lerner et al. (1992) Science, 258: 
1313, incorporated herein by reference). 

One particularly advantageous approach has been the use of scFv phage-libraries (Huston
et al, 1988, Proc. Natl. Acad. Sci. U.S.A., 85: 5879-5883; Chaudhary et al (1990) Proc.
Natl. Acad. Sci. U.S.A., 87: 1066-1070; McCafferty et al (1990) supra; Clackson et al
(1991) supra; Marks et al (1991) supra; Chiswell et al (1992) Trends Biotech., 10: 80;
Marks et al (1992) supra). Various embodiments of scFv libraries displayed on
bacteriophage coat proteins have been described. Refinements of phage display
approaches are also known, for example as described in WO96/06213 and WO92/01047
(Medical Research Council et al) and WO97/08320 (Morphosys), which are incorporated
herein by reference.

Alternative library selection technologies include bacteriophage lambda expression
systems, which may be screened directly as bacteriophage plaques or as colonies of
lysogens, both as previously described (Huse et al (1989) Science, 246: 1275; Caton and
Koprowski (1990) Proc. Natl. Acad. Sci. U.S.A., 87; Mullinax et al (1990) Proc. Natl.
Acad. Sci. U.S.A., 87: 8095; Persson et al (1991) Proc. Natl. Acad. Sci. U.S.A., 88: 2432)
and are of use in the invention. Whilst such expression systems can be used to screen
up to 10^6 different members of a library, they are not well suited to screening larger
numbers (greater than 10^6 members). Other screening systems rely, for example, on direct
chemical synthesis of library members. One early method involves the synthesis of
peptides on a set of pins or rods, such as described in WO84/03564. A similar method
involving peptide synthesis on beads, which forms a peptide library in which each bead is
an individual library member, is described in U.S. Patent No. 4,631,211 and a related
method is described in WO92/00091. A significant improvement of the bead-based
methods involves tagging each bead with a unique identifier tag, such as an
oligonucleotide, so as to facilitate identification of the amino acid sequence of each
library member. These improved bead-based methods are described in WO93/06121.

Another chemical synthesis method involves the synthesis of arrays of peptides (or
peptidomimetics) on a surface in a manner that places each distinct library member (e.g.,
unique peptide sequence) at a discrete, predefined location in the array. The identity of
each library member is determined by its spatial location in the array. The locations in the
array where binding interactions between a predetermined molecule (e.g., a receptor) and
reactive library members occur are determined, thereby identifying the sequences of the
reactive library members on the basis of spatial location. These methods are described in
U.S. Patent No. 5,143,854; WO90/15070 and WO92/10092; Fodor et al (1991) Science,
251: 767; Dower and Fodor (1991) Ann. Rep. Med. Chem., 26: 271.

Other systems for generating libraries of polypeptides or nucleotides involve the use of 
cell-free enzymatic machinery for the in vitro synthesis of the library members. In one 
method, RNA molecules are selected by alternate rounds of selection against a target 
ligand and PCR amplification (Tuerk and Gold (1990) Science, 249: 505; Ellington and 
Szostak (1990) Nature, 346: 818). A similar technique may be used to identify DNA 
sequences which bind a predetermined human transcription factor (Thiesen and Bach 
(1990) Nucleic Acids Res., 18: 3203; Beaudry and Joyce (1992) Science, 257: 635; 
WO92/05258 and WO92/14843). In a similar way, in vitro translation can be used to
synthesise polypeptides as a method for generating large libraries. These methods, which
generally comprise stabilised polysome complexes, are described further in WO88/08453,
WO90/05785, WO90/07003, WO91/02076, WO91/05058, and WO92/02536. Alternative
display systems which are not phage-based, such as those disclosed in WO95/22625 and
WO95/11922 (Affymax), use the polysomes to display polypeptides for selection. These
and all the foregoing documents are also incorporated herein by reference.

A preferred selection procedure for intracellular immunoglobulins which are stable in an 
intracellular environment, are correctly folded and are functional with respect to the 
selective binding of their ligand within that environment is described in WO00/54057. In 
this approach, the antibody-antigen interaction method uses antigen linked to a DNA- 
binding domain as a bait and the scFv linked to a transcriptional activation domain as a 
prey. Specific interaction of the two facilitates transcriptional activation of a selectable 
reporter gene. An initial in-vitro binding step is performed in which an antigen is assayed 
for binding to a repertoire of immunoglobulin molecules. Those immunoglobulins which 
are found to bind to their ligand in in vitro assays are then assayed for their ability to bind to
a selected antigen in an intracellular environment, generally in a cytoplasmic
environment.

INTRACELLULAR CONSENSUS FRAMEWORKS

Intracellular immunoglobulins are advantageously based on an intracellular consensus 
sequence framework. Advantageously, the consensus is described by at least one of the
consensus sequences described in Figure 5 and set forth in Tse, E., Lobato, M. N.,
Forster, A., Tanaka, T., Chung, T. Y. G. and Rabbitts, T. H. (2002) J. Mol. Biol., 317,
85-94. Advantageously, the "consensus" used in the present invention is at least 85%
identical to that shown in Figure 5; preferably 90%, 95%, 96%, 97%, 98%, 99% or 100%
identical thereto. Preferably, in the calculation of identity, the amino acid residues of
CDR3 are excluded from consideration. 
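
By way of illustration only, the identity calculation described above might be sketched as follows. The function names, the 0-based half-open coordinate convention and the assumption that the two sequences have already been aligned to equal length are ours and are not taken from the specification; the 85% default simply mirrors the lowest threshold given in the text.

```python
def percent_identity(candidate, consensus, excluded=()):
    """Percent identity between two pre-aligned, equal-length sequences.

    `excluded` holds (start, end) 0-based half-open ranges, for example the
    CDR3 positions, that are left out of the calculation (an assumed
    convention, chosen for this sketch).
    """
    if len(candidate) != len(consensus):
        raise ValueError("sequences must be aligned to the same length")
    skip = set()
    for start, end in excluded:
        skip.update(range(start, end))
    compared = 0
    matches = 0
    for i, (a, b) in enumerate(zip(candidate, consensus)):
        if i in skip:
            continue
        compared += 1
        matches += a == b
    if compared == 0:
        raise ValueError("no positions left to compare")
    return 100.0 * matches / compared


def meets_threshold(candidate, consensus, cdr3, threshold=85.0):
    """True if identity outside the CDR3 range reaches the threshold."""
    return percent_identity(candidate, consensus, [cdr3]) >= threshold
```

With hypothetical toy strings, `percent_identity("ABCDEFGHIJ", "ABCDEFGXYZ")` counts all ten positions, whereas passing `[(7, 10)]` as the excluded range compares only the first seven.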

Intracellular consensus frameworks have been demonstrated, as described in our 
copending international patent application PCT/GB02/003512, to provide a basis for the 
construction of immunoglobulins which are stable intracellularly in vivo.

The consensus frameworks can be used as a basis for design and construction of 
immunoglobulins according to the present invention. Oligonucleotides based on the 
consensus can be used to construct immunoglobulin genes as described; moreover, 
mutagenesis of CDR sequences using the consensus frameworks allows the generation of
libraries of intracellularly active immunoglobulins. 

The invention is further illustrated below, in the following examples. 

Examples

MATERIALS AND METHODS 

Mammalian transactivation domain vector pEF-VP16 

The vector pEF-VP16 was constructed for expression of scFv prey in mammalian two
hybrid assays. In this vector, scFv sequences may be cloned into SfiI-NotI sites in-frame
with the VP16 transcriptional transactivator domain (AD) to make a fusion gene
controlled by the promoter of the polypeptide elongation factor 1α (EF-1α), which allows
high protein expression in mammalian cells (Mizushima, S. and Nagata, S. (1990) Nucl.
Acids Res., 18, 5322). The VP16 activation domain fragment, including a nuclear
localisation signal (nls), was amplified by PCR using pNLVP16 as template and the
VP16 AD fragment was sub-cloned into the NotI site of pEF/myc/cyto (Invitrogen). To
change the SfiI cloning site of pEF/myc/cyto to an SfiI site compatible with most scFv
fragments, the SfiI region of this vector was mutagenised using the two oligonucleotides
5'-CGTGAACACGTGGTGGCCCAGCCGGCCCAGGTGCAGC and
5'-GCTGCACCTGGGCCGGCTGGGGGGCCACGTGTTCACG with the QuikChange
Site-directed Mutagenesis Kit (Stratagene) according to the manufacturer's instructions. The final
clone has the EF-1α promoter, a multi-cloning site including SfiI-NotI sites compatible
with scFv fragment insertions, a nuclear localisation signal and the VP16 AD (Figure 1A).

De novo antibody gene synthesis 

For antibody gene synthesis, oligonucleotides were designed on an scFv coding sequence
comprising the VH and VL framework of the intrabody consensus (Tse, E., Lobato, M.
N., Forster, A., Tanaka, T., Chung, T. Y. G. and Rabbitts, T. H. (2002) J. Mol. Biol., 317,
85-94) and the CDRs of an anti-β-galactosidase scFv R4 (Martineau, P., Jones, P. and
Winter, G. (1998) J. Mol. Biol., 280, 117-127) (Figure 1B). The double strands of DNA
were divided into 18 oligonucleotides, of which 16 are 90 bases long and the 2
oligonucleotides flanking the ends of the scFv are respectively 100 bases on the 5' end and
60 bases on the 3' end. Each opposite strand oligonucleotide overlaps by 40-50 bases to
ensure good annealing. All crude oligonucleotides were purified on 8% polyacrylamide
gels containing 7M urea and visualised by UV shadowing, using fluorescent thin-layer
chromatographic plates. Oligonucleotides were eluted by soaking the gel slice in 0.3M
sodium acetate overnight at room temperature (~20°C). The supernatant was collected by
centrifugation and the oligonucleotides were precipitated with ethanol. The concentration
of the purified oligonucleotides was calculated from the absorption spectrum. One µg of
each of the purified oligonucleotides was phosphorylated in a final volume of 100 µl in
the presence of 2 µl T4 polynucleotide kinase (10 U/µl) and 1 mM rATP. The volume was
increased to 100 µl using NTE (100 mM NaCl, 10 mM Tris, 1 mM EDTA) and
phosphorylation carried out by incubation at 37°C for 30 min. The reaction was stopped
by incubation at 70°C for 10 min. The oligonucleotides were annealed after boiling the
reaction for 30 sec and allowing to cool to room temperature (~20°C) over 40 min.
Ligation of the annealed oligonucleotides was carried out using 15 µl of the annealed
mixture, 2 µl of 10X T4 ligase buffer and 1 µl T4 DNA ligase (400 U/µl) in a final volume
of 20 µl. The mixture was incubated at 15°C overnight. The assembled oligonucleotides
were finally PCR amplified with conseSFI and conseNOT primers (see Figure 1B), which
include an SfiI site at the 5' end and a NotI site at the 3' end for sub-cloning into pEF-VP16.
A master mix for 5 PCR reactions (final volume 30 µl) was prepared containing 500 ng of
each primer (i.e. conseSFI and conseNOT), 2.5U pfu polymerase, 0.2 mM dNTPs, 1X
PCR reaction buffer and 1 µl of the ligated oligo mixture. PCR reaction conditions were
denaturation at 94°C for 5 min, followed by 30 cycles of 94°C for 1 min, 55°C for 1 min
and 72°C for 1 min, and a final extension at 72°C for 5 min. The PCR product was
separated on a 1% agarose gel and purified using the QIAEXII gel purification kit (Qiagen).
The purified product (eluted in 40 µl elution buffer) and the expression vector were
digested with 1 µl SfiI (10 U/µl) in a volume of 30 µl at 50°C for 5-6 hours and vector
linearisation was checked on an aliquot before proceeding to the NotI digestion. If the
SfiI digestions appeared complete, digestion with 1 µl NotI (10 U/µl) was carried out at
37°C for 16 hours. The digested PCR products were purified on agarose gels, ligated
with vector using T4 ligase at 15°C for 16 hours and transformed into E. coli TG-1. The
constructs were verified by restriction enzyme digestion using SfiI and NotI and by DNA
sequence analysis.
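
The division of a double-stranded coding sequence into staggered + and - strand oligonucleotides can be illustrated with a minimal sketch. It is a simplification and not the design actually used for Figure 1B: it assumes uniform 90-base oligonucleotides with a 45-base stagger and ignores the 100-base and 60-base flanking oligonucleotides, and the function names `tile` and `design_oligos` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def tile(length, size, first):
    """Split 0..length into consecutive (start, end) segments.

    The first segment ends at `first`; later segments are `size` long,
    except that the last one takes whatever remains.
    """
    cuts = [0] + list(range(first, length, size)) + [length]
    return list(zip(cuts[:-1], cuts[1:]))


def design_oligos(top, size=90, stagger=45):
    """Return (+ strand oligos, - strand oligos), each written 5' to 3'.

    The - strand segments are offset by `stagger` bases, so that every
    junction between two + strand oligos is bridged by one - strand oligo.
    """
    plus = [top[s:e] for s, e in tile(len(top), size, size)]
    minus = [reverse_complement(top[s:e])
             for s, e in tile(len(top), size, stagger)]
    return plus, minus
```

For a hypothetical 360-base + strand this gives four 90-base + strand oligonucleotides and five - strand oligonucleotides (45, 90, 90, 90 and 45 bases), each internal junction being covered by 45 bases of complementary sequence on either side, within the 40-50 base overlap described above.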



Footprint mutagenesis

Specific mutation of the framework regions of anti-RAS scFv33 (see our copending UK
patent application "Anti-activated RAS antibodies" filed on even date herewith) into
those of anti-RAS scFvI21 was achieved by PCR-based mutagenesis (herein called
footprint mutagenesis). This was done firstly to investigate whether specific amino acid
substitutions would affect in vivo function of the anti-RAS intrabodies (i.e. antigen
binding ability). Anti-RAS scFv33 mutants are listed in Table 1A and were constructed
following the flow chart of the footprint mutagenesis method shown in Figure 2A. The
locations of the mutant primer sequences relative to the scFv33 sequence are shown in
Figure 2B. Two initial templates were used, either pEF-scFv33-VP16 or pEF-scFvI21-VP16
(respectively scFv33 and scFvI21 cloned in pEF-VP16) for PCR as listed in Table
1. Each mutagenesis comprised synthesis of two overlapping PCR products using mutant
oligonucleotides (Figure 2A, step 1) followed by complete assembly (Figure 2A, step 2)
and cloning into the pEF-VP16 vector (Figure 2A, step 3) to generate the mutated
template for the next round of footprint mutagenesis (repeat). At each step the functional
validity of the changes was estimated. Step 1 PCR reactions (final volume 20 µl)
contained 0.5 µM of each primer pair, 2.5U pfu polymerase, 0.2 mM dNTPs, 1X PCR
reaction buffer and 50 ng of pEF-scFv-VP16 template. PCR reactions were carried out
by denaturation at 95°C for 5 min, followed by 30 cycles of 95°C for 30 sec, 60°C for
30 sec and 75°C for 45 sec, and a final extension at 75°C for 10 min. Following PCR
amplification, the amplified DNA fragments were electrophoresed on 2% agarose,
extracted and purified using the QIAquick Gel Extraction Kit (Qiagen). Purified PCR
fragments were assembled and amplified by PCR in Step 2 with pEF-VP16 vector
primers EFFP (5'-TCTCAAGCCTCAGACAGTGGTTC-3') and VP162R
(5'-CAACATGTCCAGATCGAA-3') by denaturation at 95°C for 5 min followed by a
gradient annealing at 60°C to 30°C (0.1°C per sec reduction in temperature), and gradient
extension at 30°C to 75°C (0.1°C per sec, increasing temperature) followed by 29 cycles
with denaturation at 95°C for 45 sec, annealing at 60°C for 45 sec and extension at 75°C
for 90 sec. The amplified DNA fragment was digested with SfiI and NotI, purified by
electrophoresis and gel extraction and re-cloned into the SfiI and NotI sites of pEF-VP16 in
Step 3. The constructs were verified by restriction enzyme digestion using SfiI and NotI,
confirmed by DNA sequencing and tested for antigen binding in vivo.
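
As a worked check of the Step 2 assembly programme, the duration of each temperature gradient follows directly from the stated rate of 0.1°C per second. The short calculation below is illustrative only; it counts the programmed hold and gradient times given in the text and deliberately ignores the instrument's own ramping between cycle steps, which the specification does not state.

```python
from fractions import Fraction


def ramp_seconds(start_c, end_c, rate_c_per_s):
    """Seconds needed to move between two temperatures at a constant rate."""
    return abs(Fraction(end_c) - Fraction(start_c)) / Fraction(rate_c_per_s)


RATE = Fraction(1, 10)  # 0.1 degrees C per second, as stated in the text

initial_denaturation = 5 * 60                 # 95 C for 5 min
annealing_ramp = ramp_seconds(60, 30, RATE)   # 60 C down to 30 C
extension_ramp = ramp_seconds(30, 75, RATE)   # 30 C up to 75 C
cycling = 29 * (45 + 45 + 90)                 # 29 cycles of 45 s, 45 s, 90 s

total_seconds = (initial_denaturation + annealing_ramp
                 + extension_ramp + cycling)

print(f"annealing gradient: {annealing_ramp} s")
print(f"extension gradient: {extension_ramp} s")
print(f"programmed time: {float(total_seconds) / 60:.1f} min")
```

The annealing gradient therefore lasts 300 seconds and the extension gradient 450 seconds, which gives the overlapping fragments several minutes to anneal and extend before conventional cycling begins.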

The construction of pEF-scFvI21R33-VP16 (i.e. scFvI21R33 has the CDRs of anti-RAS
scFv33 and the frameworks of anti-RAS scFvI21, except that the lysine at VH position 94
was changed to arginine) was performed by repeating the PCR-assembly-cloning
procedures described above according to Table 1B, starting with Mut1 as a template.
Each round of mutation gave the mutated template for the next round of the stepwise
footprint mutagenesis using the conditions described above.

Diversification of VH CDR3 by footprint mutagenesis for intrabody library 
construction 

A flow chart outlining the construction of an scFv library based on VH CDR3
randomisation is shown in Figure 3A. The primers used to randomise CDR3 of the VH
domain in anti-ABL scFvA25 (Tse, E., Lobato, M. N., Forster, A., Tanaka, T., Chung, T.
Y. G. and Rabbitts, T. H. (2002) J. Mol. Biol., 317, 85-94) and the partial nucleotide and
protein sequence of anti-ABL scFvA25 are shown in Figure 3B. The CDR3 randomisation
was performed using footprint mutagenesis. The template encoding anti-ABL scFvA25
was sub-cloned into the pEF-VP16 vector. In step 1, two PCR products were made using
the pEF-scFvA25-VP16 template, viz. the VH domain plus JH using the PCR primers
EFFP2 plus A25C3Bn, and the VL using the PCR primers A25CDR3F plus VP162R. The
two PCR reactions yield overlapping products (Figure 3A). The A25C3Bn primer comprised
three distinct oligonucleotides, each with a homologous sequence footprint around
mutagenic regions of 3, 6 and 10 codons to generate mutations within VH CDR3.
Amplified PCR fragments (for VH and VL regions) were individually electrophoresed on
agarose and purified. The two PCR products were assembled in a second PCR reaction
using oligonucleotides EFFP and VP162R, which encompass the whole scFv (i.e. VH and
VL). The final PCR product was digested with SfiI and NotI and ligated with SfiI-NotI-digested
pEF-VP16. Ligated DNA was electroporated into E. coli DH10B (Invitrogen).
Clones were randomly picked from each final ligation (i.e. from A25C3B3, A25C3B6
and A25C3B10) and sequenced to verify the insert and the correct integration of CDRs.
Primer sequences:-

EFFP2: 5'-GGAGGGGTTTTATGCGATGG-3'
EFFP: 5'-TCTCAAGCCTCAGACAGTGGTTC-3'
A25C3B: 5'-GACGGTGACCAGGGTTCCCTGGCCCC(A/CNN)nTCTCGCACAGTATATTAC-3',
where n=3, 6 or 10 to randomise amino acid residues in CDR3 of VH domain.
A25CDR3F: 5'-GGGGCCAGGGAACCCTGGTCACCGTC-3'
VP162R: 5'-CAACATGTCCAGATCGAA-3'
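
The structure of the A25C3B primer family can be made explicit with a short sketch. It writes the (A/CNN) unit with the IUPAC code M (A or C), builds the primer for n = 3, 6 or 10 and takes the reverse complement to show how the randomised region reads on the coding strand. The helper names are hypothetical and the sketch is offered as an illustration of the primer design rather than as part of the described method.

```python
# IUPAC complements for the codes used here (M = A/C, K = G/T, N = any base).
IUPAC_COMPLEMENT = str.maketrans("ACGTMKN", "TGCAKMN")

FIVE_PRIME_FOOTPRINT = "GACGGTGACCAGGGTTCCCTGGCCCC"
THREE_PRIME_FOOTPRINT = "TCTCGCACAGTATATTAC"
A25CDR3F = "GGGGCCAGGGAACCCTGGTCACCGTC"


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain M, K or N."""
    return seq.translate(IUPAC_COMPLEMENT)[::-1]


def a25c3b(n):
    """A25C3B primer carrying n randomised MNN codons (n = 3, 6 or 10)."""
    return FIVE_PRIME_FOOTPRINT + "MNN" * n + THREE_PRIME_FOOTPRINT


for n in (3, 6, 10):
    primer = a25c3b(n)
    print(n, len(primer), reverse_complement(primer))
```

On the coding strand each MNN unit reads as NNK, and the reverse complement of the 5' footprint is identical to the A25CDR3F primer, which is what makes the two step 1 PCR products overlap for assembly.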

Diversification of VH CDRs by footprint mutagenesis for intrabody library 
construction 

A flow chart outlining the randomisation of the VH CDRs is shown in Figure 4A. Two
templates were used, one encoding the VH domain from anti-RAS scFvI21R33 and the
other that of the canonical intrabody consensus sequence (Tse et al., supra), each
sub-cloned into the pEF-VP16 vector.

Library 1 (CDR2/3): Randomisation of VH CDR2 and CDR3 in each case was done
by footprint mutagenesis as described above. In the first round of PCR amplification
(Fig. 4A, step 1), two parts of the VH domain were separately amplified by PCR using
two pairs of oligonucleotides: EFFP2 plus CDR2R (to randomise CDR2) and CDR2F
plus CDR3R (to randomise CDR3). Amplified PCR fragments were electrophoresed on
agarose and purified. In the second round of PCR (Fig. 4A, step 2), the two PCR
fragments were assembled using PCR oligonucleotides EFFP2 and JH5R. After
purification of the PCR product, a final PCR (Fig. 4A, step 3) was performed using EFFP and
NotVHJR1 to allow digestion with SfiI and NotI and ligation into yeast pVP16* vector
cut with SfiI + NotI. Ligated DNA was electroporated into competent E. coli DH10B.
This facilitated the generation of two libraries (each called VH CDR2/3 library 1) with
diversities of 2 x 10^6 (I21R33-derived library) and 1.4 x 10^6 (consensus library).

Library 2 (CDR1/2/3): For randomisation of VH CDR1, the two CDR2/3 libraries
(library 1) were used as templates. Two PCR reactions were carried out with the pairs of
oligonucleotides sFvVP16F plus CDR1R (to randomise CDR1) and CDR1F plus
VP162R (to copy the remaining part of the VH segment) (Figure 4B). The two PCR
fragments were assembled using sFvVP16F and VP162R, digested with SfiI and NotI
and ligated into yeast pVP16* vector cut with SfiI and NotI. This facilitated the
generation of two libraries (each called library 2) with diversities of 3.04 x 10^7 (I21R33-derived
library) and 2.215 x 10^7 (consensus library). Clones were randomly picked from each
library and sequenced to verify the insert and the correct integration of CDRs (Fig. 4C).
Primer sequences:-

CDR2R: 5'-CAGAGTCTGCATAGTATAT(MNN)5ACTAATGTATGAAACCCAC-3'
CDR2F: 5'-ATATACTATGCAGACTCTG-3'
CDR3R: 5'-TCCCTGGCCCCAGTAGTCAAA(MNNMNN)nCCCTCTCGCACAGTAATAG-3',
where n=1 to 6 to randomise amino acid residues in CDR3 of VH domain.
JH5R: 5'-GGTGACCAGGGTTCCCTGGCCCCAGTAGTC-3'
NotVHJR1: 5'-ATAAGAATGCGGCCGCCGCTCGAGACGGTGACCAGGGTTCCCTG-3'
sFvVP16F: 5'-TGGGTCCGCCAGGCTCCAGG-3'
CDR1R: 5'-CCTGGAGCCTGGCGGACCCAMNNCAT(MNN)3CTGAAGCTGAATCCAGAGG-3'
CDR1F: 5'-TGGGTCCGCCAGGCTCCAGG-3'
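
The MNN units in these reverse primers correspond to NNK codons on the coding strand. As a hedged illustration of what that codon set can encode, the sketch below enumerates the NNK codons against the standard genetic code and computes the number of possible combinations for a run of five randomised codons, as in CDR2R. The variable names are ours, and the comparison with library size is an observation on the figures quoted above rather than a statement from the specification.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
NNK_CODONS = [a + b + c for a, b in product("ACGT", repeat=2) for c in "GT"]
encoded = [CODON_TABLE[codon] for codon in NNK_CODONS]

amino_acids = sorted(set(encoded) - {"*"})
stop_codons = [c for c in NNK_CODONS if CODON_TABLE[c] == "*"]

print(len(NNK_CODONS), "codons;", len(amino_acids), "amino acids;",
      "stop codons:", stop_codons)
print("DNA combinations for 5 codons:", len(NNK_CODONS) ** 5)
print("protein combinations for 5 codons:", len(amino_acids) ** 5)
```

NNK covers all twenty amino acids with 32 codons and a single stop codon (TAG). Five randomised codons already allow 20^5 = 3.2 x 10^6 protein sequences, so libraries of the sizes reported here would be expected to sample the possible CDR combinations rather than exhaust them.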

Mammalian two hybrid assay in CHO-CD4 using FACS analysis

Chinese hamster ovary (CHO) cells were grown in minimal essential medium α (α-MEM,
Invitrogen) with 10% foetal calf serum, penicillin and streptomycin. FACS
analysis using the CHO-CD4 reporter line (Fearon, E. R., Finkel, T., Gillison, M. L.,
Kennedy, S. P., Casella, J. F., Tomaselli, G. F., Morrow, J. S. and Dang, C. V. (1992)
Proc. Natl. Acad. Sci. USA, 89, 7958-7962) was performed as described previously (Tse
et al., supra) with small modifications. 3 x 10^5 CHO-CD4 cells were seeded in 6 well
plates on the day before transfection. 0.5 µg of pM1-HRASG12V (DBD-RAS) or
pM1-βgal (DBD-βgal) and 1 µg of pEF-scFvR4-VP16 or pEF-scFvconR4-VP16 were
co-transfected into the cells using lipofectAMINE™ according to the manufacturer's
instructions. Forty-eight hours after transfection, cells were washed, dissociated using
cell dissociation solution (Sigma) and re-suspended in PBS. The induction of cell surface
CD4 expression was detected using anti-human CD4 antibody (Pharmingen) and
FITC-conjugated anti-mouse IgG (Pharmingen). The relative fluorescence of the cells was
measured with a FACSCalibur (Becton Dickinson) and the data were processed using the
CELLQuest software.

RESULTS AND DISCUSSION 
De novo antibody gene synthesis 

The production of antibody V-genes from known protein sequence data was carried out
by development of a set of overlapping oligonucleotides corresponding to the intracellular
antibody scFv consensus framework (Tse et al., supra) together with VH and VL CDRs
from the anti-βgal scFv R4 (18) (Figure 1B). Annealing and ligation of the mixture of
oligonucleotides was followed by PCR of the assembled scFv and finally cloning into the
mammalian expression vector pEF-VP16 after SfiI-NotI digestion (Figure 1A;
pEF-VP16). The synthetic scFv was cloned to derive pEF-scFvconR4-VP16 and this was
sequenced to verify the scFv and its junction with the VP16 AD domain. The
effectiveness of the hybrid scFvconR4 as an intrabody was assayed in a reporter assay,
co-transfecting the pEF-scFvconR4-VP16 clone plus DBD-lacZ (Tse, E. and Rabbitts, T.
H. (2000) Proc. Natl. Acad. Sci. USA, 97, 12266-12271) into the CHO-CD4 line (this line
carries a CD4 gene with a minimal promoter regulated by five repeated Gal4
DNA-binding sites and can be transcriptionally activated and monitored by expression of cell
surface CD4 (Fearon, E. R., Finkel, T., Gillison, M. L., Kennedy, S. P., Casella, J. F.,
Tomaselli, G. F., Morrow, J. S. and Dang, C. V. (1992) Proc. Natl. Acad. Sci. USA, 89,
7958-7962)). When pEF-scFvconR4-VP16 was transfected into CHO-CD4 with a clone
encoding a Gal4 DNA binding domain (DBD) fused to βgal, we detected
activation of CD4 expression (Figure 1C). This compares with analogous experiments
using the DBD-βgal with the original scFvR4 (as pEF-scFvR4-VP16). However, no
activation of CD4 expression was observed with either pEF-scFvconR4-VP16 or
pEF-scFvR4-VP16 co-transfected with a non-relevant bait, DBD-RAS. This shows that the de
novo gene synthesis method is efficient for cloning antibody fragments which retain their
specificity and verifies the consensus framework as an intrabody expression scaffold.

Footprint mutagenesis to create intrabody diversity 

The intracellular antibody capture method defined an scFv consensus sequence which
proved particularly advantageous for intracellular use (Tse, E., Lobato, M. N., Forster, A.,
Tanaka, T., Chung, T. Y. G. and Rabbitts, T. H. (2002) J. Mol. Biol., 317, 85-94) because
the method selects intrabodies based on in vivo screens (Visintin, M., Tse, E., Axelson,
H., Rabbitts, T. H. and Cattaneo, A. (1999) Proc. Natl. Acad. Sci. USA, 96, 11723-11728;
Tse, E., Lobato, M. N., Forster, A., Tanaka, T., Chung, T. Y. G. and Rabbitts, T. H.
(2002) J. Mol. Biol., 317, 85-94; Visintin, M., Settanni, G., Maritan, A., Graziosi, S.,
Marks, J. D. and Cattaneo, A. (2002) J. Mol. Biol., 317, 73-83). One specific antibody
derived using this method was scFv33, which is an anti-RAS antibody able to bind RAS
in mammalian cells. A second scFv, scFvI21, was derived from a RAS yeast screen but
did not bind RAS in mammalian cells, although its expression level was superior to that of
scFv33. We wished to assess the importance of specific IAC consensus framework
residues and we have used a PCR-based mutagenesis procedure (herein called footprint
mutagenesis) to make mutations in the scFv33 framework (Table 1A) for evaluation in a
mammalian cell luciferase reporter assay (Tse, E., Lobato, M. N., Forster, A., Tanaka, T.,
Chung, T. Y. G. and Rabbitts, T. H. (2002) J. Mol. Biol., 317, 85-94). With a series of
changes, the scFv33 framework was effectively converted to that of scFvI21 in a step-wise manner
(Table 1), giving an scFv which exemplifies the consensus framework and which retains the parental ability
to bind to RAS antigen (summarised in Table 1A). The templates scFv33 and scFvI21
were cloned into pEF-VP16 to be used as PCR templates and sequential mutagenesis was
carried out. Full conversion required seven rounds of PCR, assembly and cloning (Table
1B; note that the only change to the scFvI21 framework was that the lysine residue at position 94
was changed to an arginine, consistent with the canonical intrabody consensus). At each
step a new template was created and sequenced to verify the specificity of the PCR and
each mutation was tested for function. For sequential mutation, each round provides the
template for the next mutagenesis step (Table 1B; Figure 2). With this approach, a new scFv
could be created with the framework of scFvI21R and the CDRs of scFv33
(scFvI21R33).

Using footprint mutagenesis to diversify CDRs and create intrabody libraries 

Footprint mutagenesis can be applied to create single or small changes in a specified
region. We have used this method to diversify one CDR in an scFv (VH CDR3 of the
anti-ABL scFvA25 (Tse, E., Lobato, M. N., Forster, A., Tanaka, T., Chung, T. Y. G. and
Rabbitts, T. H. (2002) J. Mol. Biol., 317, 85-94)) to generate a library of different
sequences. Footprint mutagenesis was achieved in this case using as the template
scFvA25 cloned in pEF-VP16 and an internal mixture of PCR primers covering scFv VH
CDR3 in which the oligonucleotides contain three, six or ten codon equivalents of
randomised sequence (Figure 3A, B; primer A25C3Bn). After the first PCR, the two PCR
products were assembled (Figure 3A, 2nd PCR) and the final product was cloned into
pEF-VP16 to create a library of individual clones. Randomly picked clones were
sequenced in the CDR3 region (Figure 3C), confirming that the CDR3 had been changed
by 3, 6 or 10 residues respectively. Thus this simple footprint mutagenesis method
generates a high degree of diversity, which is directly related to the design of the primers used
for the PCR steps.

In the procedure illustrated in Figures 2 and 3, each step introduced mutation(s) at
only one position in the scFv. Mutations can be made at two, and potentially more,
positions by using mutant oligonucleotides for each PCR step, prior to assembly. This is
shown in Figure 4, which illustrates simultaneous randomisation of the CDR2 and CDR3
regions of a VH template and subsequent randomisation of CDR1 (the same strategy
would apply to mutagenesis of VL). In the examples for CDR2/3 changes, two VH
sub-region PCR reactions were carried out, each using a fixed sequence primer together with
a randomising primer (Figure 4A, EFFP2 + CDR2R or CDR2F + CDR3R, where the two
reverse primers are mutagenic for CDR2 and CDR3 respectively). The two PCR products
overlap in the CDR2 region and a PCR assembly was achieved using the flanking
primers, followed by a final PCR with EFFP plus NotVHJR1 for cloning into the SfiI-NotI
site of yeast pVP16* vector. Sequences of a selection of clones showed diversity of
the CDR2 and CDR3 regions (Figure 4C). A mixed library of 3.4 x 10^6 clones from the
above was used as a substrate for a second round of mutagenesis at CDR1 using the pairs
of primers EFFP2 plus CDR1R (a mutagenic primer for CDR1) and CDR1F plus
VP162R. Sequences of a selection of clones showed diversity of the CDR1, as well as of the
CDR2 and CDR3 regions (Figure 4C). This two step procedure thus allows production
of randomised CDR1, 2 and 3. It should be possible to devise a similar protocol for
simultaneously mutating CDR1, 2 and 3.

In summary, the de novo antibody gene synthesis method and footprint
mutagenesis are powerful tools for making antibody genes and for the acquisition of
immunoglobulin mutants, for instance for affinity maturation, which would involve CDR
changes. We show that the de novo antibody gene synthesis method is a simple,
oligonucleotide-based annealing, ligation and PCR procedure to make an antibody
fragment suitable for cloning into a compatible vector. In our specific use of de novo
intrabody production, a hybrid scFv was made in which the intracellular antibody capture
(IAC) consensus was the scaffold and an anti-βgal antibody (Martineau, P., Jones, P. and
Winter, G. (1998) J. Mol. Biol., 280, 117-127) provided the CDR sequences. We chose the
IAC consensus because it is advantageous for mammalian in-cell expression and the
anti-βgal antibody because it was specially developed in bacteria for soluble expression. Our
hybrid scFv was able to bind to its target antigen in CHO cells with comparable
efficiency to the parental scFv (Figure 1C). This adds further validation to the IAC
consensus scaffold as a suitable intrabody scaffold and shows that de novo intrabody gene
production is viable. Diversification of the antibody fragments was carried out by
footprint mutagenesis, allowing one or two step conversion of a V-gene. Generation of
whole intrabody libraries was achieved by this means. Thus intrabody libraries can be
made starting with the consensus intrabody sequence and using these two simple in vitro
methods. These libraries are ready for direct in vivo screening with any antigen that can
be made as a bait in a suitable yeast two-hybrid vector.

Table 1. Templates and primers for stepwise footprint mutagenesis to convert scFv33 to
scFvI21R33

The primers were used in footprint mutagenesis, illustrated in Figure 2, of the framework
of the scFv33 sequence, to convert it to the I21R33 sequence.

A. At the first round, both pEF-scFv33-VP16 and pEF-scFvI21-VP16 were used as
templates. Individual mutations were incorporated with the primers, as indicated.
B. At subsequent rounds, the PCR template used was the previously mutated version,
except round 2 in which either Mut1 or pEF-scFvI21R-VP16 was used.



All publications mentioned in the above specification are herein incorporated by
reference. Various modifications and variations of the described methods and system of
the invention will be apparent to those skilled in the art without departing from the scope
and spirit of the invention. Although the invention has been described in connection with
specific preferred embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed, various
modifications of the described modes for carrying out the invention which are apparent to
those skilled in molecular biology or related fields are intended to be within the scope of
the following claims.

1. A method for preparing a double stranded nucleic acid which encodes an
immunoglobulin, comprising the steps of:

(a) providing a set of three or more overlapping oligonucleotides which anneal to
form the + and - strands of a nucleic acid which encodes at least part of an
immunoglobulin variable domain;

(b) annealing the oligonucleotides;

(c) replicating the + and - strands of the nucleic acid formed from the annealed
oligonucleotides; and

(d) inserting the nucleic acid into an expression vector.

2. A method according to claim 1, wherein the double stranded nucleic acid encodes 
an immunoglobulin variable domain. 

3. A method according to claim 1 or claim 2, wherein the variable domain is a VH 
domain. 

4. A method according to claim 1 or claim 2, wherein the variable domain is a VL 
domain. 

5. A method according to claim 1 or claim 2, wherein the immunoglobulin is a scFv 
and the nucleic acid encodes both a VH domain and a VL domain, linked by a linker 
sequence. 

6. A method according to any preceding claim, wherein the nucleic acid further 
encodes at least one effector domain or a label. 

7. A method according to any preceding claim, comprising the further steps of: 

(e) amplifying the nucleic acid encoding the immunoglobulin variable domain 
using a first set of at least four primers, of which two primers are overlapping, and 
wherein one of the overlapping primers consists of a plurality of different primers, each 
of which comprises a mutagenic region which generates a mutation in a part of the 
nucleic acid sequence; 

(f) purifying the amplification products thus obtained; 

(g) assembling the amplification products in a further amplification reaction using 
a second set of primers which encompass the entire nucleic acid which encodes the 
immunoglobulin variable domain; and 

(h) inserting the assembled amplification product into an expression vector. 

8. A method according to claim 7 wherein the nucleic acid and assembled 
amplification product which is finally obtained after the mutagenesis procedure encode 
the gene product which it is desired to obtain in its entirety. 

9. A method according to claim 7 or claim 8, wherein primers are used which 
comprise a multiple codon equivalence of randomised or semi-randomised sequence. 


10. A method according to any one of claims 7 to 9, wherein steps (e) to (g) are 
repeated, using different primers, to mutate different regions of the nucleic acid and thus 
generate a library of diversified nucleic acid molecules which encompass mutations in a 
plurality of regions. 


11. A method according to any one of claims 7 to 9, wherein two or more diversified 
primer sets are used simultaneously to diversify two or more separate regions. 

12. A method according to any one of claims 7 to 11 wherein two or more PCR 
products are generated which overlap. 

13. A method according to any one of claims 7 to 12, wherein multiple codon 
equivalence extends over two to twelve codons. 

14. A method for preparing a library of nucleic acids encoding a diversified 
immunoglobulin, comprising the steps of: 

(a) amplifying a nucleic acid encoding the immunoglobulin using a first set of at 
least four primers, of which two primers are overlapping, and wherein one of the 
overlapping primers consists of a plurality of different primers, each of which comprises a 
mutagenic region which generates a mutation in a part of the nucleic acid; 

(b) purifying the amplification products thus obtained; 

(c) assembling the amplification products in a further amplification reaction using 
a second set of primers which encompass the entire nucleic acid which encodes the 
immunoglobulin; and 

(d) inserting the assembled amplification product into an expression vector. 

15. A method according to claim 14 wherein the nucleic acid and assembled 
amplification product which is finally obtained after the mutagenesis procedure encode 
the gene product which it is desired to obtain in its entirety. 

16. A method according to claim 14 or claim 15, wherein primers are used which 
comprise a multiple codon equivalence of randomised or semi-randomised sequence. 

17. A method according to any one of claims 14 to 16, wherein two or more 
diversified primer sets are used simultaneously to diversify two or more separate regions. 

18. A method according to any one of claims 14 to 17 wherein two or more PCR 
products are generated which overlap. 

19. A method according to any one of claims 14 to 18, wherein multiple codon 
equivalence extends over two to twelve codons. 

20. The method of any preceding claim, further comprising the steps of causing the 
expression vector to produce the nucleic acid molecule and isolating the immunoglobulin 
thus produced. 

21. An immunoglobulin produced by expression of an expression vector according to 
any one of claims 1 to 19. 



22. A library produced by the method of any one of claims 14 to 19. 

FIG. 4 